Unmasking Bias: A Framework for Evaluating Treatment Benefit Predictors Using Observational Studies

Yuan Xia, Mohsen Sadatsafavi, and Paul Gustafson
Abstract

Treatment benefit predictors (TBPs) map patient characteristics into an estimate of the treatment benefit tailored to individual patients, which can support optimizing treatment decisions. However, assessing their performance can be challenging when treatment assignment is not random. This study conducts a conceptual analysis, which can be applied to finite-sample studies. We present a framework for evaluating TBPs using observational data from a target population of interest. We then explore the impact of confounding bias on TBP evaluation using measures of calibration and discrimination, namely moderate calibration and the concentration of the benefit index ($C_b$), respectively. We illustrate that failure to control for confounding can lead to misleading values of performance metrics, and we establish how the confounding bias propagates to an evaluation bias by giving explicit expressions for the bias of each performance metric. These findings underscore the necessity of accounting for confounding factors when evaluating TBPs, ensuring more reliable and contextually appropriate treatment decisions.


Keywords: calibration; discrimination; confounding bias; precision medicine.

1 Introduction

Precision medicine aims to optimize medical care by tailoring treatment decisions to the unique characteristics of each patient. This objective naturally falls at the intersection of predictive analytics and causal inference; the former aims at predicting the outcome of interest, and the latter seeks to answer counterfactual “what if” questions about the outcome. Most of the progress in predictive analytics has centred around predicting risks. To customize medical treatments, we must shift our focus to predicting treatment benefits. Such prediction is often termed “causal prediction” or “counterfactual prediction” (Prosperi et al., 2020). Many studies have investigated whether and how a specific covariate or a set of covariates modifies the treatment benefit, such as Abrevaya et al. (2015), Robertson et al. (2021), and Zhou and Zhu (2021). We refer to a function that maps patient characteristics to an estimate of treatment benefit as a treatment benefit predictor (TBP). Before being adopted in patient care, a pre-specified TBP needs to be evaluated (validated) in the target population of interest (la Roi-Teeuw et al., 2024).

The validation process for TBPs is currently an active area of research (Kent et al., 2020). Traditionally, performance metrics for risk prediction are categorized into measures of overall fit, discrimination, calibration, and clinical utility (net benefit) (Riley et al., 2019; Steyerberg, 2019). Discrimination pertains to the predictive capacity of distinguishing individuals with and without the outcome of interest. Calibration focuses on the proximity of predicted and actual risks. Net benefit assesses the clinical usefulness of a risk prediction algorithm by quantifying the trade-off between the benefits of a true positive classification versus the harms of a false positive one. In the context of treatment benefit prediction, various performance measures for TBPs have been formulated by extending the concepts from risk prediction to the treatment-benefit paradigm. For instance, Vickers et al. (2007) provided an extension of net benefit for TBPs, and van Klaveren et al. (2019) and Hoogland et al. (2022) discussed extensions of calibration and discrimination. Efthimiou et al. (2023) amalgamated calibration and discrimination into measures for decision accuracy. However, extending these methods is not straightforward, since we cannot observe the outcome (treatment benefit) due to the unavailability of the counterfactual. Thus, assessing the performance of TBPs poses a significant challenge.

TBPs can be validated using data from randomized controlled trials (RCTs), where treatment assignment is not systematically confounded. Nevertheless, RCTs from the target population of interest are not always available. Even if available, they are often underpowered to evaluate a TBP or lack sufficient follow-up time to elucidate treatment effects on relevant outcomes. For some interventions where equipoise is not established, conducting an RCT might be unethical. Hence, observational studies might give the only opportunity to examine the performance of a TBP in the target population.

Using observational studies adds complexity primarily due to the potential presence of confounding bias, which hinders the identification of treatment benefits. Confounding bias and how it influences the estimation of estimands have been extensively studied. For instance, Imbens (2003) and Veitch and Zaveri (2020) have investigated the influence of confounding bias on the average treatment effect (ATE). However, its influence on TBP evaluation has received less scrutiny. In this study, we show the impact of failing to fully control for confounding on TBP evaluation and offer a comprehensive conceptual evaluation framework applicable to any performance metric. We consider calibration and discrimination and, through a conceptual analysis, focus on two specific performance metrics as illustrative examples of assessing pre-specified TBPs.

2 Notation and Assumptions

Each individual in the target population is described by $(Y^{(0)}, Y^{(1)}, A, X, Z)$ with joint distribution $\mathbb{P}$, where $A \in \{0,1\}$ is the treatment indicator, with $A=0$ denoting the absence of the treatment and $A=1$ its presence; $Y^{(a)}$ is the counterfactual outcome that would be observed under treatment $a$; $X \in \mathbb{R}^{d}$ is the set of pre-treatment covariates observable in routine clinical practice and also in observational studies that will be used to predict treatment benefit; and $Z \in \mathbb{R}^{p}$ is a distinct set of additional covariates that might be only available in the observational study, and might be needed to control for confounding. For instance, $X$ can be blood pressure and age available at the point of care, which are used to predict the benefit of statin therapy for cardiovascular diseases. Meanwhile, $Z$, socioeconomic status, is a confounding variable but not often used for predicting benefit from statins.

Individual treatment benefit is quantified as $B = Y^{(1)} - Y^{(0)}$, which is unobservable. For instance, when an individual has received $A=0$, the corresponding outcome $Y^{(1)}$ remains unobserved (and therefore counterfactual). The conditional mean outcome under treatment $a$ is denoted as $\mu_a(x,z) = \operatorname{E}[Y^{(a)} \mid X=x, Z=z]$. We also denote $\tau(x,z) = \operatorname{E}[B \mid X=x, Z=z]$ and $\tau_s(x) = \operatorname{E}[B \mid X=x]$. Typically, $\tau(x,z)$ is referred to as the conditional average treatment effect (CATE), as $\{X,Z\}$ is the entire input covariate space from $\mathbb{P}$. However, when our focus is on a subset of covariates $X \subset \{X,Z\}$, we concentrate on $\tau_s(X)$. The ATE is $\tau^{*} = \operatorname{E}[B]$.

A TBP, denoted as $h(x)$, predicts the benefit of an active treatment of interest based on patient characteristics $X$ known in routine clinical practice. It can guide treatment decision-making, for example by the care provider offering treatment only to those with $h(x) > 0$. We denote the predicted treatment benefit from $h(x)$ as $H := h(X)$ and its cumulative distribution function (CDF) as $F_H(\cdot)$. The best possible TBP is $\tau_s(x)$ itself, and the corresponding prediction is $\tau_s(X)$ with CDF $F_{\tau_s(X)}(\cdot)$.

This study is motivated by the question: how can $h(x)$ be evaluated using representative observational data from the target population where treatment is confounded? In an observational study, we can observe i.i.d. draws of $(Y, A, X, Z)$ from the joint distribution $\mathbb{P}_{obs}$, which is a consequence of $\mathbb{P}$. To evaluate $h(x)$, the following three assumptions are universally required: (1) no interference: between any two individuals, the treatment taken by one does not affect the counterfactual outcomes of the other; (2) consistency: the counterfactual outcome under the observed treatment assignment equals the observed outcome $Y$, i.e., $Y = Y^{(1)}A + Y^{(0)}(1-A)$; and (3) conditional exchangeability: the treatment assignment is independent of the counterfactual outcomes, given the set of variables $X \cup Z$, i.e., $A \perp\!\!\!\perp Y^{(0)}, Y^{(1)} \mid X \cup Z$. The first two assumptions are known as the stable-unit-treatment-value assumption (SUTVA) (Rubin, 1980). The last one assumes no unmeasured confounders given $X$ and $Z$.
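To make the data structure concrete, the following minimal sketch draws hypothetical individuals and applies the consistency relation. The outcome model, the benefit `0.5 * x`, and the propensity `0.2 + 0.6 * z` are illustrative assumptions and are not taken from the study.

```python
import random


def draw_individual(rng):
    """Draw one hypothetical individual.

    Returns (full, observed): `full` holds both counterfactual outcomes,
    `observed` holds only what an observational study would record.
    """
    x = rng.random()                    # covariate available to the TBP
    z = rng.random()                    # additional covariate, a confounder here
    y0 = x + z + rng.gauss(0.0, 1.0)    # outcome without treatment (assumed model)
    y1 = y0 + 0.5 * x                   # outcome with treatment (assumed benefit)
    # Treatment depends on z only, so conditional exchangeability holds
    # given (x, z) in this illustrative setup.
    a = 1 if rng.random() < 0.2 + 0.6 * z else 0
    y = y1 * a + y0 * (1 - a)           # consistency: Y = Y1 * A + Y0 * (1 - A)
    full = {"y0": y0, "y1": y1, "b": y1 - y0, "a": a, "x": x, "z": z}
    observed = {"y": y, "a": a, "x": x, "z": z}
    return full, observed


rng = random.Random(0)
sample = [draw_individual(rng) for _ in range(1000)]
```

Only one of the two counterfactual outcomes ever reaches the observed record, which is why the benefit $B$ cannot be read off the data directly.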

3 Performance Metrics

In this study, we explore two specific metrics, one for calibration and one for discrimination, within the population-level framework. This framework enables us to conceptually understand how observational data can identify the performance of $h(\cdot)$ and to explore the extent of potential misguidance when failing to control for confounding.

3.1 Calibration

Van Calster et al. (2016) proposed a hierarchical definition of calibration for risk prediction models. In what follows, we focus on what they named ‘moderate calibration’: that the expected value of the outcome among individuals with the same predicted risk is equal to the predicted risk. They argue that moderate calibration is the most desired form of calibration. Similarly, in treatment benefit prediction, a TBP $h(x)$ can be considered moderately calibrated if $\operatorname{E}[B \mid H] = H$. That is, the average treatment benefit among all patients with predicted treatment benefit $H = h$ equals $h$, for any $h$. For example, if $h(x)$ is moderately calibrated and predicts a group of individuals to have $H = 0.5$, we should expect that the average treatment benefit within the group is also $0.5$. Furthermore, $h(x)$ is strongly calibrated if $\tau_s(X) = H$.

Calibration of TBPs can also be visualized in a calibration plot (Van Calster et al., 2019). The calibration plot compares $\operatorname{E}[B \mid H]$ against $H$, with a moderately calibrated TBP showing points aligned around the diagonal identity line.
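The distinction between moderate and strong calibration can be illustrated with a small sketch. It assumes a hypothetical discrete population in which $\tau_s(x)$ is known for each covariate pattern; the probabilities and benefit values are invented for illustration. Because $H$ is a function of $X$, the curve $\operatorname{E}[B \mid H]$ is obtained here by averaging $\tau_s(X)$ within each level of $H$.

```python
from collections import defaultdict

# Hypothetical population: one row per covariate pattern,
# given as (probability mass, tau_s(x), h(x)).
population = [
    (0.25, 0.10, 0.10),
    (0.25, 0.30, 0.20),
    (0.25, 0.10, 0.20),
    (0.25, 0.50, 0.50),
]


def calibration_curve(rows):
    """Return {h: E[B | H = h]} for a discrete population."""
    mass = defaultdict(float)
    weighted_benefit = defaultdict(float)
    for prob, tau, h in rows:
        mass[h] += prob
        weighted_benefit[h] += prob * tau
    return {h: weighted_benefit[h] / mass[h] for h in mass}


def is_moderately_calibrated(rows, tol=1e-9):
    """Check E[B | H = h] = h at every level of H."""
    return all(abs(mean_b - h) <= tol for h, mean_b in calibration_curve(rows).items())


def is_strongly_calibrated(rows, tol=1e-9):
    """Check tau_s(x) = h(x) for every covariate pattern."""
    return all(abs(tau - h) <= tol for _, tau, h in rows)


curve = calibration_curve(population)
```

In this invented population the two patterns sharing $H = 0.2$ have different true benefits that average to $0.2$, so the predictor is moderately but not strongly calibrated.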

3.2 Discrimination

In risk prediction, we assess discrimination using either concordance measures or measures of disparity. The c-statistic and the Gini index are examples of concordance and disparity measures, respectively. Both metrics have been extended to the field of treatment benefit prediction. However, it has been established that the c-for-benefit (van Klaveren et al., 2018), analogous to the c-statistic for TBPs, does not qualify as a proper scoring rule (Xia et al., 2023). Therefore, we shift our focus to the concentration of the benefit index ($C_b$), a single-value summary of the difference in average treatment benefit between two treatment assignment rules: ‘treat at random’ and ‘treat greater $H$’ (Sadatsafavi et al., 2020).

With i.i.d. copies $\{(B_1, H_1), (B_2, H_2)\}$ of $(B, H)$, the $C_b$ of $H$ is defined as:

$$C_b = 1 - \frac{\operatorname{E}[B_1]}{\operatorname{E}[B_1 I(H_1 \geq H_2) + B_2 I(H_1 < H_2)]}, \tag{1}$$

where $I(\cdot)$ is an indicator function. The denominator in (1) operationalizes the strategy of ‘treat greater $H$’ among two patients randomly selected from the population. If the two patients have the same $H$, we randomly assign treatment to one of them. When $\tau^{*} > 0$ and $h(x)$ is no worse than ‘treat at random,’ the $C_b$ value ranges from $0$ to $1$. If $0 \leq C_b \leq 1$, ‘treat at random’ is associated with a $C_b \times 100\%$ reduction in expected benefit compared with ‘treat greater $H$.’

The $C_b$ connects to a Gini-like coefficient determined by twice the area between the line of independence and the relative concentration curve (RCC) of $B$ with respect to $H$. The RCC orders patients by $H$ and plots the cumulative $B$ value divided by $\tau^{*}$ (Yitzhaki and Olkin, 1991). With the Gini-like coefficient denoted as $\text{Gini}_b$, $C_b$ can be alternately defined as:

$$C_b = \frac{\text{Gini}_b}{1 + \text{Gini}_b}.$$

To avoid having to consider patient pairs when evaluating the expectation in (1), we establish that

$$\operatorname{E}[B_1 I(H_1 \geq H_2) + B_2 I(H_1 < H_2)] = 2\operatorname{E}[B F_H(H)] - \operatorname{E}[B f_H(H)], \tag{2}$$

where $f_H(h) := \operatorname{P}(H = h)$ denotes the probability of $H$ taking the specific value $h$. Thus, when $H$ is continuous, $\operatorname{E}[B f_H(H)] = 0$. Although $H$ is continuous in most applications, it is helpful to derive the general expression to create simple, illustrative examples where $H$ is discrete. As the original publication on $C_b$ did not discuss estimation in the presence of ties, we elaborate on this point in Appendix B. Note that (2) enables us to concentrate on $B$ and $F_H$ (and $f_H$ if needed) for computing $C_b$, where $H$ acts as a ranking variable. Consequently, $C_b$ provides insights into the effectiveness of $h(\cdot)$ in mimicking the CATE function through its ranking ability.
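Identity (2) can be checked numerically on a small discrete population with ties in $H$. The sketch below is illustrative only: the population values are invented, and the conditional mean benefit of each covariate pattern stands in for $B$, which leaves both expectations unchanged because the two patients in a pair are independent.

```python
# Hypothetical population: (probability mass, mean benefit, h) per covariate pattern.
# Two patterns share h = 0.20, so ties in H occur with positive probability.
population = [
    (0.25, 0.10, 0.10),
    (0.25, 0.30, 0.20),
    (0.25, 0.10, 0.20),
    (0.25, 0.50, 0.50),
]


def pairwise_denominator(rows):
    """E[B1 * I(H1 >= H2) + B2 * I(H1 < H2)] by enumerating ordered pairs."""
    total = 0.0
    for p1, b1, h1 in rows:
        for p2, b2, h2 in rows:
            total += p1 * p2 * (b1 if h1 >= h2 else b2)
    return total


def single_draw_denominator(rows):
    """2 * E[B * F_H(H)] - E[B * f_H(H)], the right-hand side of (2)."""
    total = 0.0
    for p, b, h in rows:
        cdf = sum(q for q, _, g in rows if g <= h)   # F_H(h) = P(H <= h)
        pmf = sum(q for q, _, g in rows if g == h)   # f_H(h) = P(H = h)
        total += p * b * (2.0 * cdf - pmf)
    return total


def concentration_of_benefit(rows):
    """C_b from (1), with the denominator evaluated through (2)."""
    ate = sum(p * b for p, b, _ in rows)
    return 1.0 - ate / single_draw_denominator(rows)


c_b = concentration_of_benefit(population)
```

The pairwise version costs a double loop over covariate patterns, whereas the single-draw version only needs the CDF and point mass of $H$, which is the practical appeal of (2).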

4 Evaluating TBP Performance in Presence of Confounding

When using observational data to evaluate a pre-specified TBP $h(x)$, we consider both $X$ and $Z$ to adjust for confounding, even though $h(X)$ is solely a function of $X$. To evaluate the moderate calibration of any $h(X)$ in the target population, we need to determine $\operatorname{E}[B \mid H]$ (i.e., the calibration curve). We address the confounding variables $\{X, Z\}$ by initially focussing on $\tau(X,Z)$ instead of $\tau_s(X)$. Afterward, $\operatorname{E}[B \mid H]$ can be obtained by taking the average of $\tau(X,Z)$ conditional on $H$:

$$\operatorname{E}[B \mid H] = \operatorname{E}[\tau(X,Z) \mid H]. \tag{3}$$

Similarly, to compute $C_b$ of $h(x)$, we determine $\tau^{*}$ and $\operatorname{E}[B F_H(H)]$ (and $\operatorname{E}[B f_H(H)]$ if needed) in (2) by assessing $\tau(X,Z)$ as well. Consequently, $C_b$ can be expressed as

$$C_b = 1 - \frac{\operatorname{E}[\tau(X,Z)]}{\operatorname{E}[\tau(X,Z)\eta(H)]}, \tag{4}$$

where $\eta(H) = 2F_H(H) - f_H(H)$.

Note that $\tau(X,Z)$ plays a vital role in the determination of both $\operatorname{E}[B \mid H]$ and $C_b$. Various approaches are available to determine $\tau(X,Z)$, with two main methods being outcome regressions and inverse probability weighting methods (Rosenbaum and Rubin, 1983). For instance, with the outcome models $\mu_a(x,z) = \operatorname{E}[Y \mid A=a, X=x, Z=z]$, we have $\tau(x,z) = \mu_1(x,z) - \mu_0(x,z)$. With the propensity score $e(x,z) = \operatorname{P}(A=1 \mid X=x, Z=z)$, we have $\tau(x,z) = \operatorname{E}\left[\frac{Y(A - e(x,z))}{e(x,z)(1 - e(x,z))} \mid X=x, Z=z\right]$. These two approaches are equivalent in the population-level framework as long as the overlap assumption holds. The overlap assumption says that the conditional probability of receiving the active treatment or not is bounded away from $0$ and $1$, i.e., $0 < e(x,z) < 1$, for all possible $x$ and $z$. However, variations may emerge when considering specific finite-sample estimating techniques associated with each. We will return to the finite-sample estimation of the calibration curve and $C_b$ in Section 7.
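The population-level equivalence of the two approaches can be seen within a single stratum $(x, z)$. The following sketch assumes a binary outcome and takes the stratum-specific quantities $\mu_0$, $\mu_1$, and $e$ as known inputs; the function names and the numbers in the usage line are hypothetical.

```python
def tau_outcome_regression(mu0, mu1):
    """tau(x, z) as the difference of the two outcome models in one stratum."""
    return mu1 - mu0


def tau_ipw(mu0, mu1, e):
    """tau(x, z) as E[Y (A - e) / (e (1 - e)) | X = x, Z = z] in one stratum.

    The expectation is taken by enumerating the four (A, Y) cells of a
    binary-outcome stratum with P(A = 1) = e and P(Y = 1 | A = a) = mu_a.
    """
    if not 0.0 < e < 1.0:
        raise ValueError("overlap violated: need 0 < e < 1")
    total = 0.0
    for a, p_a, mu in ((1, e, mu1), (0, 1.0 - e, mu0)):
        for y, p_y in ((1, mu), (0, 1.0 - mu)):
            total += p_a * p_y * y * (a - e) / (e * (1.0 - e))
    return total


# Illustrative stratum: mu0 = 0.2, mu1 = 0.5, propensity e = 0.3.
difference = tau_ipw(0.2, 0.5, 0.3) - tau_outcome_regression(0.2, 0.5)
```

The weighting form divides by $e(1-e)$, so it is undefined when overlap fails, whereas the outcome-regression form would then rely on an outcome model for a treatment arm that is never observed in that stratum.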

If we treat the observational data as if they arose from an RCT, or if we do not sufficiently control for confounding, confounding bias may emerge. Thus, it is essential to investigate the potential confounding biases and grasp how lack of full control might affect the accuracy of our evaluations. In this study, we focus on the confounding bias that occurs when $X$ alone is not sufficient to control for confounding and denote the confounding bias as a function of $X$. For $X = x$, we have

$$\text{bias}(x) = \left(\operatorname{E}[Y \mid A=1, X=x] - \operatorname{E}[Y \mid A=0, X=x]\right) - \tau_s(x). \tag{5}$$

To illustrate the propagation of $\text{bias}(X)$ to performance metrics, we denote the inaccurate $\operatorname{E}[B \mid H]$ and $C_b$ calculated without controlling for $Z$ as $\tilde{\operatorname{E}}[B \mid H]$ and $\tilde{C}_b$, respectively.

The bias function of the calibration curve and the bias of $C_b$ are influenced by, and could be different from, $\text{bias}(X)$ in (5). For the calibration curve, the deviation from the accurate assessment can be expressed as

$$\tilde{\operatorname{E}}[B \mid H=h] - \operatorname{E}[B \mid H=h] = \operatorname{E}[\text{bias}(X) \mid H=h], \tag{6}$$

which is a function of $H$. It depends on $\text{bias}(X)$ and the association between $H$ and $X$.
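As an illustration of (5) and (6), the sketch below builds a hypothetical population with one binary $X$ and one binary $Z$, where $Z$ raises both the outcome and the chance of treatment. All probabilities are invented. The TBP is set equal to $\tau_s$, so it is perfectly calibrated, yet the curve computed without $Z$ deviates from the diagonal.

```python
# Hypothetical discrete population; keys are (x, z).
P_XZ = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3}   # P(X = x, Z = z)
MU0 = {(0, 0): 0.2, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.5}    # E[Y(0) | x, z]
MU1 = {(0, 0): 0.3, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.8}    # E[Y(1) | x, z]
E_XZ = {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.2, (1, 1): 0.6}   # P(A = 1 | x, z)


def tau_s(x):
    """True E[B | X = x]: average tau(x, z) over P(Z | X = x)."""
    p_x = sum(P_XZ[(x, z)] for z in (0, 1))
    return sum((MU1[(x, z)] - MU0[(x, z)]) * P_XZ[(x, z)] for z in (0, 1)) / p_x


def naive_contrast(x):
    """E[Y | A = 1, X = x] - E[Y | A = 0, X = x], ignoring Z."""
    treated_mass = sum(E_XZ[(x, z)] * P_XZ[(x, z)] for z in (0, 1))
    control_mass = sum((1 - E_XZ[(x, z)]) * P_XZ[(x, z)] for z in (0, 1))
    treated_mean = sum(MU1[(x, z)] * E_XZ[(x, z)] * P_XZ[(x, z)]
                       for z in (0, 1)) / treated_mass
    control_mean = sum(MU0[(x, z)] * (1 - E_XZ[(x, z)]) * P_XZ[(x, z)]
                       for z in (0, 1)) / control_mass
    return treated_mean - control_mean


def bias(x):
    """Confounding bias at X = x, as in (5)."""
    return naive_contrast(x) - tau_s(x)


def h(x):
    """Hypothetical TBP, chosen equal to tau_s so it is perfectly calibrated."""
    return 0.1 if x == 0 else 0.3


def calibration_deviation(h_value):
    """E[bias(X) | H = h_value], the right-hand side of (6)."""
    xs = [x for x in (0, 1) if h(x) == h_value]
    p_x = {x: sum(P_XZ[(x, z)] for z in (0, 1)) for x in xs}
    return sum(p_x[x] * bias(x) for x in xs) / sum(p_x.values())
```

Because `h` is one-to-one in `x` here, the deviation at each level of $H$ equals `bias(x)` itself; with a coarser predictor, several values of `bias(x)` would be averaged within a level and could partly cancel.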

For $C_b$, the confounding bias affects the calculation of both $\tau^{*}$ and $\operatorname{E}[B\eta(H)]$. The discrepancy from the actual $\tau^{*}$ is $\operatorname{E}[\text{bias}(X)]$, while the deviation from $\operatorname{E}[B\eta(H)]$ is $\operatorname{E}[\text{bias}(X)\eta(H)]$. However, expressing the deviation from the true $C_b$ is more complex, as it involves the difference between two ratios. The deviation takes the form

$$\tilde{C}_b - C_b = \frac{\tau^{*}\operatorname{E}[\text{bias}(X)\eta(H)] - \operatorname{E}[B\eta(H)]\operatorname{E}[\text{bias}(X)]}{\operatorname{E}[B\eta(H)]\left(\operatorname{E}[B\eta(H)] + \operatorname{E}[\text{bias}(X)\eta(H)]\right)}. \tag{7}$$

This value depends not only on $\text{bias}(X)$ but also on $\tau^{*}$, $\eta(H)$, and $\operatorname{E}[B\eta(H)]$.
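For readers who want the intermediate algebra, the following sketch derives (7) from the two deviations stated before it: the unadjusted numerator is $\tau^{*} + \operatorname{E}[\text{bias}(X)]$ and the unadjusted denominator is $\operatorname{E}[B\eta(H)] + \operatorname{E}[\text{bias}(X)\eta(H)]$. The shorthand $D$, $b$, and $b_\eta$ is introduced here only for readability and is not notation from the study.

```latex
% Shorthand used in this sketch only:
%   D      = E[B \eta(H)]
%   b      = E[bias(X)]
%   b_\eta = E[bias(X) \eta(H)]
\begin{aligned}
\tilde{C}_b - C_b
  &= \left(1 - \frac{\tau^{*} + b}{D + b_\eta}\right)
     - \left(1 - \frac{\tau^{*}}{D}\right) \\
  &= \frac{\tau^{*}}{D} - \frac{\tau^{*} + b}{D + b_\eta} \\
  &= \frac{\tau^{*}(D + b_\eta) - (\tau^{*} + b)\,D}{D\,(D + b_\eta)} \\
  &= \frac{\tau^{*} b_\eta - D\,b}{D\,(D + b_\eta)}.
\end{aligned}
```

The last line is (7) written in the shorthand, and its numerator vanishes exactly when $\tau^{*} b_\eta = D\,b$.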

According to the biases (6) and (7), the deviations in moderate calibration and $C_b$ may be zero even with non-zero $\text{bias}(X)$. In particular, zero deviation in moderate calibration would occur when $\operatorname{E}[\text{bias}(X) \mid H=h] = 0$ for all $h$, and zero deviation in $C_b$ would occur when $\tau^{*}\operatorname{E}[\text{bias}(X)\eta(H)] = \operatorname{E}[\text{bias}(X)]\operatorname{E}[B\eta(H)]$. In the next section, we further investigate these biases in several illustrative examples to demonstrate how $\text{bias}(X)$ influences the evaluation results.

5 Examples Relevant to Confounding Bias in Evaluation

In this section, we establish two synthetic populations to illustrate the impact of confounding bias on evaluating given TBPs in the population-level framework. The first population describes a linear $\tau(x,z)$ function with binary outcome and covariates, which enables exploring to what extent the strength of confounding affects the bias of both metrics. The second population has a non-linear $\tau(x,z)$ function with continuous outcome and covariates. This offers flexibility in defining $\tau_s(x)$, $\tau(x,z)$ and the propensity score function $e(x,z)$. Unlike prior confounding bias studies, we investigate the propagation of confounding bias to $\operatorname{E}[B \mid H]$ and $C_b$ for both populations, where both are determined in closed form.

5.1 Population 1. Binary Outcome and Covariates

Assume the dimensions of the two sets of covariates are $d = 2$ and $p = 1$, and all of $Y, A, X_1, X_2$ and $Z$ are binary. In this binary outcome setup, the individual treatment benefit $B \in \{-1, 0, 1\}$. We assume $Y^{(0)} \perp\!\!\!\perp Y^{(1)} \mid X, Z$ and the distribution $\mathbb{P}$ is in the form of

\begin{align*}
(Y^{(0)}\mid X,Z;\alpha_{0}) &\sim \text{Bernoulli}(\alpha_{00}+\alpha_{01}X_{1}+\alpha_{02}X_{2}+\alpha_{03}Z),\\
(Y^{(1)}\mid X,Z;\alpha_{1}) &\sim \text{Bernoulli}(\alpha_{10}+\alpha_{11}X_{1}+\alpha_{12}X_{2}+\alpha_{13}Z),\\
(A\mid X,Z;\beta) &\sim \text{Bernoulli}(\beta_{0}+\beta_{1}Z),\\
(X,Z\mid p) &\sim \text{Multivariate Bernoulli}\big(p_{000}^{(1-x_{1})(1-x_{2})(1-z)}+p_{001}^{(1-x_{1})(1-x_{2})z}+p_{010}^{(1-x_{1})x_{2}(1-z)}+p_{100}^{x_{1}(1-x_{2})(1-z)}\\
&\qquad +p_{011}^{(1-x_{1})x_{2}z}+p_{101}^{x_{1}(1-x_{2})z}+p_{110}^{x_{1}x_{2}(1-z)}+p_{111}^{x_{1}x_{2}z}\big),
\end{align*}

where $\mu_{a}(X,Z)$ is a linear combination of $A$, $X$, and $Z$. Distribution $\mathbb{P}$ leads to a linear $\tau(x,z)$, which is

\begin{align*}
\tau(X,Z) &= (\alpha_{10}-\alpha_{00})+(\alpha_{11}-\alpha_{01})X_{1}+(\alpha_{12}-\alpha_{02})X_{2}+(\alpha_{13}-\alpha_{03})Z,
\end{align*}

and $\tau_{s}(X)=\operatorname{E}[\tau(X,Z)\mid X]$.

There are $18$ parameters to capture the relationship between outcome, treatment, and covariates, with constraints imposed on all $\alpha$, $\beta$, and $p$ to ensure legitimate distributions. The linear propensity score function is $e(X,Z)=\beta_{0}+\beta_{1}Z$. The strength of confounding is determined by the values of $\beta_{1}$, $\alpha_{03}$, and $\alpha_{13}$.

We formulate three TBPs: $h_{1}(x_{1},x_{2})$ is the mean of the covariates, $h_{2}(x_{1},x_{2})$ is designed to be moderately calibrated, and $h_{3}(x_{1},x_{2})$ is designed to be strongly calibrated, by carefully choosing coefficients. The expressions for these three TBPs are as follows:

\begin{align*}
h_{1}(x_{1},x_{2}) &= \frac{x_{1}+x_{2}}{2},\\
h_{2}(x_{1},x_{2}) &= b_{0}+b_{1}(x_{1}+x_{2})+b_{2}x_{1}x_{2},\\
h_{3}(x_{1},x_{2}) &= c_{0}+c_{1}x_{1}+c_{2}x_{2}+c_{3}x_{1}x_{2},
\end{align*}

where $b_{l}$, $l=0,1,2$, and $c_{k}$, $k=0,1,2,3$, are coefficients. Moreover, $h_{3}(x_{1},x_{2}):=\tau_{s}(x)$, which is a bijective function that maps the four distinct values of $(x_{1},x_{2})$ to four unique prediction values. It therefore exhibits strong calibration and surpasses all other potential TBPs. (See Appendix A for the detailed definitions of the coefficients.)

5.2 Population 2. Continuous Outcome and Covariates

We still assume $d=2$ and $p=1$, but let $X_{1}$, $X_{2}$, and $Z$ be independent, each following the uniform distribution on the interval $[0,1]$. Adopting the setup proposed by Foster and Syrgkanis, (2023),

\begin{align*}
(A\mid X,Z) &\sim \text{Bernoulli}(e(X,Z)),\\
(Y^{(a)}\mid X,Z) &\sim \text{N}\left(\tau_{s}(X)\left(a-0.5\right)+b(X,Z),\sigma^{2}\right),
\end{align*}

where conditional independence is assumed, i.e., $Y^{(0)}\perp\!\!\!\perp Y^{(1)}\mid X,Z$. Note that $X$ and $Z$ contribute to explaining the outcome and treatment assignment. Let $\sigma=0.1$ and consider simple functions: the propensity score function $e(X,Z)=Z$ and the base response function $b(X,Z)=\max\left(Z,X_{2}\right)+0.1X_{1}$. As one example, we define the following CATE function and TBP:

\begin{align*}
\tau_{s}(X) &= \max\left(X_{1},X_{2}\right),\\
h(X) &= X_{1}+X_{2}.
\end{align*}

The predicted treatment benefit $H$ is a sum of two i.i.d. uniform random variables on $[0,1]$, which follows a triangular distribution with parameters: lower limit $a=0$, upper limit $b=2$, and mode $c=1$.

The variable $Z$ is independent of $B$ conditional on $X$ (i.e., $\tau(X,Z)=\tau_{s}(X)$). Therefore, $Z$ is not an effect modifier but a confounding variable. Note that $\tau_{s}(X)$, the maximum of these two uniform random variables, follows $\text{Beta}(2,1)$. Hence, the population average treatment benefit is $\tau^{*}=2/3$.
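As a brief worked check of the two distributional facts used above (a sketch that relies only on the independence and uniformity of $X_{1}$ and $X_{2}$; the symbol $f_{H}$ for the density of $H$ is introduced here for illustration):

\begin{align*}
\Pr\{\tau_{s}(X)\leq t\} &= \Pr\{X_{1}\leq t\}\Pr\{X_{2}\leq t\}=t^{2},\qquad 0\leq t\leq 1,\\
\tau^{*}=\operatorname{E}[\tau_{s}(X)] &= \int_{0}^{1}t\cdot 2t\,dt=\frac{2}{3},\\
f_{H}(h) &= \begin{cases}h, & 0\leq h\leq 1,\\ 2-h, & 1<h\leq 2.\end{cases}
\end{align*}

The first line is the CDF of $\text{Beta}(2,1)$, with density $2t$, and the last line is the triangular density with lower limit $0$, upper limit $2$, and mode $1$.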

6 Metrics Performance in Two Synthetic Populations

Figure 1: Calibration plots and relative concentration curves (RCC) for $h_{1}(X_{1},X_{2})$, $h_{2}(X_{1},X_{2})$, and $h_{3}(X_{1},X_{2})$ when $\beta_{1}=0.7621$. The three plots on the left-hand side are calibration plots. The three plots on the right-hand side are RCCs. The blue dotted curves refer to $\operatorname{E}[B\mid H]$ and $C_{b}$, and the red dashed curves refer to $\tilde{\operatorname{E}}[B\mid H]$ and $\tilde{C}_{b}$.

For the first synthetic population, we employed a specific set of parameters to compare evaluation results for TBPs with and without controlling for $Z$. This selection is just one instance among numerous potential examples:

\begin{align*}
\alpha_{0} &= (0.629,0.143,-0.479,-0.058),\\
\alpha_{1} &= (0.335,0.304,-0.334,0.314),\\
p &= (p_{111},p_{110},p_{101},p_{100},p_{011},p_{010},p_{001},p_{000})\\
&= (0.181,0.100,0.035,0.148,0.174,0.087,0.121,0.153),\\
\beta &= (0.120,\beta_{1}),
\end{align*}

where the values in $p$ were randomly generated but represent a valid joint distribution of $(X_{1},X_{2},Z)$. Upon establishing these parameters, we define the target population and consequently determine $\text{bias}(X)$, which is greater than $0$ for all $X=x$ and is a non-linear function of $X$. In this setup, the value of $\text{bias}(X)$ is influenced not only by the three parameters that determine confounding strength but also by the other parameters; see Appendix A for further discussion of $\text{bias}(X)$ under various “strengths of confounding.” We then evaluate the three pre-specified TBPs using the calibration curve and $C_{b}$ with and without confounding bias. Note that $h_{1}(x_{1},x_{2})$ and $h_{2}(x_{1},x_{2})$ exhibit analogous mapping patterns: each maps the four unique combinations of $(x_{1},x_{2})$ to three distinct $H$ values. We compute $\operatorname{E}[B\mid H]$ and $C_{b}$ through closed-form expressions and calculate $\tilde{\operatorname{E}}[B\mid H]$ and $\tilde{C}_{b}$ either via (6) and (7), or by using the inaccurate $\tau_{s}(X)$ instead of $\tau(X,Z)$ in (3) and (4).
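The positivity of $\text{bias}(X)$ in this population can be illustrated with a short numerical sketch. It assumes that $\text{bias}(x)$ denotes the unadjusted contrast $\operatorname{E}[Y\mid A=1,X=x]-\operatorname{E}[Y\mid A=0,X=x]$ minus $\tau_{s}(x)$, takes $\beta_{1}=0.762$, and renormalizes the rounded $p$ values (which sum to $0.999$). The helper names `mu` and `bias_table` are hypothetical and not taken from the paper.

```python
import numpy as np

# Parameter values quoted in the text; p is indexed as p[x1, x2, z].
alpha0 = np.array([0.629, 0.143, -0.479, -0.058])
alpha1 = np.array([0.335, 0.304, -0.334, 0.314])
p = np.zeros((2, 2, 2))
p[1, 1, 1], p[1, 1, 0], p[1, 0, 1], p[1, 0, 0] = 0.181, 0.100, 0.035, 0.148
p[0, 1, 1], p[0, 1, 0], p[0, 0, 1], p[0, 0, 0] = 0.174, 0.087, 0.121, 0.153
p = p / p.sum()  # assumption: renormalize because the rounded values sum to 0.999


def mu(alpha, x1, x2, z):
    """Linear outcome probability P(Y^(a) = 1 | x1, x2, z)."""
    return alpha[0] + alpha[1] * x1 + alpha[2] * x2 + alpha[3] * z


def bias_table(beta0, beta1):
    """Return {(x1, x2): (tau_s, bias)} for the propensity e = beta0 + beta1 * z."""
    z = np.arange(2)
    e = beta0 + beta1 * z                              # P(A = 1 | z)
    out = {}
    for x1 in (0, 1):
        for x2 in (0, 1):
            pz = p[x1, x2] / p[x1, x2].sum()           # P(Z = z | x)
            pz_a1 = pz * e / (pz * e).sum()            # P(Z = z | A = 1, x)
            pz_a0 = pz * (1 - e) / (pz * (1 - e)).sum()  # P(Z = z | A = 0, x)
            m0, m1 = mu(alpha0, x1, x2, z), mu(alpha1, x1, x2, z)
            tau_s = ((m1 - m0) * pz).sum()             # E[tau(X, Z) | X = x]
            naive = (m1 * pz_a1).sum() - (m0 * pz_a0).sum()  # ignores Z
            out[(x1, x2)] = (tau_s, naive - tau_s)
    return out


confounded = bias_table(0.120, 0.762)
unconfounded = bias_table(0.120, 0.0)
for x, (tau_s, b) in confounded.items():
    print(x, round(tau_s, 4), round(b, 4))
```

Setting `beta1` to zero makes the conditional distributions of $Z$ given $A$ coincide with its distribution given $X$ alone, so the computed bias vanishes, which matches the statement that $Z$ is then no longer a confounder.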

When $\beta_{1}=0.762$, the evaluation results are illustrated in Figure 1, where the three plots on the left display the moderate calibration curves of $h_{1}(X_{1},X_{2})$, $h_{2}(X_{1},X_{2})$, and $h_{3}(X_{1},X_{2})$. We see that $h_{2}$ and $h_{3}$ are moderately calibrated, aligning closely with the 45-degree line. Additionally, while $h_{1}$ lacks moderate calibration, its predictions are positively associated with $\operatorname{E}[B\mid H]$. Figure 1 further highlights distinct disparities between $\operatorname{E}[B\mid H]$ and $\tilde{\operatorname{E}}[B\mid H]$; in particular, $\operatorname{E}[\text{bias}(X)\mid H]>0$ for all three TBPs because $\text{bias}(X)>0$. Consequently, the failure to control for confounding variables results in an inaccurate calibration assessment.

The three plots on the right in Figure 1 show the RCCs and $C_{b}$ values for the three TBPs. The RCCs and $C_{b}$ for $h_{1}(X_{1},X_{2})$ and $h_{2}(X_{1},X_{2})$ are identical. This is because the CDFs of $H_{1}$ and $H_{2}$ are the same, so the two TBPs rank patients identically. Note that the optimal TBP $h_{3}(X_{1},X_{2})$ yields a slightly larger $C_{b}$ and $\text{Gini}_{b}$ than $h_{1}(X_{1},X_{2})$ and $h_{2}(X_{1},X_{2})$. This reflects that $h_{3}(X_{1},X_{2})$ yields a more effective treatment assignment rule, leading to a larger average treatment benefit. However, $\tilde{C}_{b}$ is smaller than $C_{b}$ for all three TBPs.

When $\beta_{1}=0$, all blue dotted curves align with the red dashed curves for both performance metrics because $Z$ is no longer a confounding variable. In particular, in the calibration plots for $h_{2}(X_{1},X_{2})$ and $h_{3}(X_{1},X_{2})$, all curves lie on the 45-degree diagonal line. (See Figure 5 in Appendix A for the corresponding calibration plots, $C_{b}$, and RCCs.) The findings highlight the importance of controlling for confounding variables when conducting evaluations in observational studies to obtain accurate results. Ignoring confounding variables can produce misleading patterns, to an extent that depends on the strength and direction of the associations.

In the second synthetic population, with the actual $\tau_{s}(X)$, we compute the calibration curve $\operatorname{E}[B\mid H]$ and $C_{b}$ by first deriving the closed-form joint distribution of $(\tau_{s}(X),H)$ and then analyzing the corresponding closed-form conditional distribution of $(\tau_{s}(X)\mid H)$. Similarly, we compare $\operatorname{E}[B\mid H]$ and $C_{b}$ with $\tilde{\operatorname{E}}[B\mid H]$ and $\tilde{C}_{b}$. (See Appendix C for calculation details.) In this setup, $\text{bias}(X)$ is a function of $X_{2}$: $\text{bias}(x)=1/3-x_{2}^{2}+(2/3)x_{2}^{3}$, which is illustrated in Figure 2.

Figure 2: The confounding bias function, $\text{bias}(X)$, which is only a function of $X_{2}$ in Setting 2.
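The form of $\text{bias}(x)$ in the second population can be recovered with a short calculation. This is a sketch, assuming that $\text{bias}(x)$ denotes the unadjusted contrast $\operatorname{E}[Y\mid A=1,X=x]-\operatorname{E}[Y\mid A=0,X=x]$ minus $\tau_{s}(x)$; the densities $f(z\mid A=a)$ below follow from $e(X,Z)=Z$ with $Z$ uniform on $[0,1]$ and independent of $X$. The terms $\tau_{s}(x)(a-0.5)$ contribute exactly $\tau_{s}(x)$ to the contrast and the term $0.1x_{1}$ cancels, so only the $\max(Z,x_{2})$ part of $b(X,Z)$ remains:

\begin{align*}
f(z\mid A=1) &= 2z,\qquad f(z\mid A=0)=2(1-z),\\
\operatorname{E}[\max(Z,x_{2})\mid A=1] &= \int_{0}^{x_{2}}x_{2}\,2z\,dz+\int_{x_{2}}^{1}z\,2z\,dz=\frac{2}{3}+\frac{x_{2}^{3}}{3},\\
\operatorname{E}[\max(Z,x_{2})\mid A=0] &= \int_{0}^{x_{2}}x_{2}\,2(1-z)\,dz+\int_{x_{2}}^{1}z\,2(1-z)\,dz=\frac{1}{3}+x_{2}^{2}-\frac{x_{2}^{3}}{3},\\
\text{bias}(x) &= \frac{1}{3}-x_{2}^{2}+\frac{2}{3}x_{2}^{3},\qquad \operatorname{E}[\text{bias}(X)]=\frac{1}{3}-\frac{1}{3}+\frac{2}{3}\cdot\frac{1}{4}=\frac{1}{6}.
\end{align*}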

The average treatment benefit conditioned on the predicted treatment benefit from $h(X_{1},X_{2})$ is $\operatorname{E}[B\mid H]=3H/4$ for $0<H\leq 1$ and $\operatorname{E}[B\mid H]=0.5+H/4$ for $1<H<2$, which follows because $X_{1}$ is uniformly distributed given $X_{1}+X_{2}=H$. Using the second treatment assignment rule based on $H$, the average treatment benefit is $2\operatorname{E}[BF_{H}(H)]=0.7833$, slightly exceeding the population average $\tau^{*}=2/3$. The $C_{b}$ for $h(X)$ is $0.1489$. When assigning treatment based on the predicted treatment benefit from $\tau_{s}(X)$, the average treatment benefit from the second treatment assignment rule is $2\operatorname{E}[BF_{\tau_{s}(X)}(\tau_{s}(X))]=0.8$. In other words, the second treatment assignment rule, based on the actual $\operatorname{E}[B\mid X]$, does not exhibit a significant improvement in average treatment benefit compared to $H$. The corresponding $C_{b}$ for $\tau_{s}(X)$ is $0.1667$.

Figure 3: Calibration plot and relative concentration curve (RCC) for Setting 2.

However, confounding bias causes a deviation from the actual $\operatorname{E}[B\mid H]$, given by the piecewise function:

\begin{align*}
\tilde{\operatorname{E}}[B\mid H]-\operatorname{E}[B\mid H]=\begin{cases}1/(6h), & 0<h\leq 1,\\ 1/(6(2-h)), & 1\leq h<2.\end{cases}
\end{align*}

It also causes an overestimation of $\tau^{*}$ by $\operatorname{E}[\text{bias}(X_{2})]=1/6$ and an inaccurate $2\operatorname{E}[BF_{H}(H)]$ calculated as $0.9032$. Consequently, for $h(X)$, the $\tilde{C}_{b}$ of $0.0773$ is lower than the $C_{b}$ of $0.1489$.
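The summary quantities reported for the second population can be checked numerically with a short sketch. It integrates over a midpoint grid on $[0,1]^{2}$ and assumes that $C_{b}$ takes the form $1-\operatorname{E}[B]/(2\operatorname{E}[BF_{H}(H)])$, a form that is consistent with the values reported in the text but is not defined in this section. The helper name `c_b` and the grid size are illustrative choices.

```python
import numpy as np

n = 500
g = (np.arange(n) + 0.5) / n                  # midpoints of a regular grid on [0, 1]
x1, x2 = np.meshgrid(g, g, indexing="ij")

tau_s = np.maximum(x1, x2)                    # CATE, tau_s(X) = max(X1, X2)
h = x1 + x2                                   # TBP, h(X) = X1 + X2
bias = 1 / 3 - x2**2 + (2 / 3) * x2**3        # bias(x) quoted in the text

# CDF of the triangular distribution on [0, 2] with mode 1, and of Beta(2, 1).
F_h = np.where(h <= 1, h**2 / 2, 1 - (2 - h) ** 2 / 2)
F_tau = tau_s**2


def c_b(benefit, rank):
    """Assumed form of the index: 1 - E[benefit] / (2 E[benefit * rank])."""
    return 1 - benefit.mean() / (2 * (benefit * rank).mean())


tau_star = tau_s.mean()                              # population average benefit
gain_h = 2 * (tau_s * F_h).mean()                    # 2 E[B F_H(H)]
gain_tau = 2 * (tau_s * F_tau).mean()                # same rule, ranking by tau_s(X)
gain_h_biased = 2 * ((tau_s + bias) * F_h).mean()    # Z ignored
cb_h, cb_tau = c_b(tau_s, F_h), c_b(tau_s, F_tau)
cb_h_biased = c_b(tau_s + bias, F_h)

print(tau_star, gain_h, cb_h, gain_tau, cb_tau, gain_h_biased, cb_h_biased)
```

Grid averages stand in for the expectations because $X_{1}$ and $X_{2}$ are independent and uniform; replacing $\tau_{s}$ by $\tau_{s}+\text{bias}$ mimics an evaluation that ignores $Z$.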

Figure 3 illustrates the calibration plot and RCC for $h(X_{1},X_{2})$ with and without controlling for $Z$. It shows the distinct influence of confounding bias on the moderate calibration, RCC, and $C_{b}$ assessments. In the calibration plot, the red curve significantly deviates from the blue curve as the value of $H$ approaches both extremes, near $0$ and $2$. Moreover, confounding bias reduces the area between the independence line and the RCC by roughly half.

7 Discussion

In clinical settings, TBPs derived from prior studies offer valuable guidance for physicians and patients in making informed treatment decisions. These TBPs, which may be developed from various populations, should be evaluated in the target population before implementation (Riley et al.,, 2024). Observational data, where treatment assignment is not at random, might be the only opportunity for such evaluation. Consequently, addressing confounding bias is crucial when assessing treatment benefits on observational data.

This study evaluated pre-specified TBPs using observational studies and explored how confounding bias influences the evaluation of TBPs in a population-level framework. We examined two specific metrics, one focusing on calibration and the other on discrimination, and we derived bias expressions for moderate calibration and $C_{b}$. We demonstrated that the failure to control for confounding variables leads to inaccurate assessments of moderate calibration and $C_{b}$. The impact of confounding bias on the assessment of moderate calibration and $C_{b}$ differs. The two synthetic populations presented lead to two positive $\text{bias}(X)$ functions, which are two examples of many other possible $\text{bias}(X)$ functions. These two $\text{bias}(X)$ functions result in $\tilde{\operatorname{E}}[B\mid H]>\operatorname{E}[B\mid H]$; nevertheless, $\tilde{C}_{b}<C_{b}$. In other words, positive confounding bias may lead to overestimation of $\tau^{*}$, $\tau_{s}(X)$, and $\operatorname{E}[B\mid H]$ but underestimation of $C_{b}$, at least for these choices of the TBP $h(x)$.

This study conducted a conceptual analysis, which lays the groundwork for finite-sample estimation. To evaluate pre-determined TBPs using real-world observational data, the primary challenge shifts to estimating $\tau(X,Z)$ and then $\tau_s(X)$ from the sample. As previously discussed, $\tau(X,Z)$ can be estimated through outcome regression, inverse probability weighting, or a combination of both, such as the doubly robust method (Bang and Robins, 2005). Estimating each performance metric may pose its own challenges. For instance, for the calibration curve, we need to estimate the conditional expectation of the estimated $\tau(X,Z)$ given $H$. When $H$ is discrete, the estimation can rely on the sample average within groups sharing the same value of $H$. However, the estimation for continuous $H$ is non-trivial. For $C_b$, the CDF of $H$, and possibly its density $f_H(h)$, can be estimated either through the empirical distribution or by modelling $H$.
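As a minimal sketch of the discrete-$H$ case described above, the following Python snippet averages estimated $\tau(X,Z)$ values within groups sharing the same TBP value. The function name `calibration_by_group` and the toy numbers are hypothetical, and the `tau_hat` values merely stand in for the output of any of the estimators mentioned (outcome regression, inverse probability weighting, or a doubly robust method). It illustrates the idea only and is not an estimation procedure used in this study.

```python
from collections import defaultdict


def calibration_by_group(h_values, tau_hat):
    """Estimate E[B | H = h] for discrete H by averaging tau_hat within each level of H."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for h, t in zip(h_values, tau_hat):
        sums[h] += t
        counts[h] += 1
    # One point of the calibration curve per distinct value of H
    return {h: sums[h] / counts[h] for h in sorted(sums)}


# Hypothetical toy data: TBP values and estimated individual benefits
h_values = [0.1, 0.1, 0.2, 0.2, 0.2, 0.3]
tau_hat = [0.05, 0.15, 0.10, 0.20, 0.30, 0.40]

curve = calibration_by_group(h_values, tau_hat)
for h, mean_benefit in curve.items():
    print(f"H = {h}: estimated E[B | H] = {mean_benefit:.4f}")
```

Under moderate calibration, each group mean would be close to the corresponding value of $H$; for continuous $H$ this grouping is not available and some form of smoothing or modelling would be needed instead.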

The preceding discussion has identified several areas for future research. One might wonder whether there is a performance metric for TBPs that is less sensitive to confounding bias. We examined two performance metrics; an investigation of other existing performance metrics is needed to answer this question. Additionally, the two synthetic populations assume independent counterfactual outcomes because this assumption is commonly used in applications. However, real-world target populations can be far more complex, and further research is needed to examine populations with correlated counterfactual outcomes. Moreover, our conceptual analysis provides a better understanding of the impact of confounding bias on TBP evaluation, and the proposed framework can be applied to real data sets. It is then natural to explore which of the existing CATE estimation methods is more flexible in handling a complex $\tau(X,Z)$ function and which gives a more precise TBP assessment for making treatment decisions. Ultimately, the final building blocks for finite-sample estimation are solutions to the challenges of estimating the various performance metrics.

References

  • Abrevaya et al., (2015) Abrevaya, J., Hsu, Y.-C., and Lieli, R. P. (2015). Estimating conditional average treatment effects. Journal of Business & Economic Statistics, 33(4):485–505.
  • Bang and Robins, (2005) Bang, H. and Robins, J. M. (2005). Doubly robust estimation in missing data and causal inference models. Biometrics, 61(4):962–973.
  • Efthimiou et al., (2023) Efthimiou, O., Hoogland, J., Debray, T. P., Seo, M., Furukawa, T. A., Egger, M., and White, I. R. (2023). Measuring the performance of prediction models to personalize treatment choice. Statistics in Medicine, 42(8):1188–1206.
  • Foster and Syrgkanis, (2023) Foster, D. J. and Syrgkanis, V. (2023). Orthogonal statistical learning.
  • Hoogland et al., (2022) Hoogland, J., Efthimiou, O., Nguyen, T.-L., and Debray, T. P. (2022). Evaluating individualized treatment effect predictions: a new perspective on discrimination and calibration assessment. arXiv preprint, (arXiv:2209.06101).
  • Imbens, (2003) Imbens, G. W. (2003). Sensitivity to exogeneity assumptions in program evaluation. American Economic Review, 93(2):126–132.
  • Kent et al., (2020) Kent, D. M., Paulus, J. K., Van Klaveren, D., D’Agostino, R., Goodman, S., Hayward, R., Ioannidis, J. P., Patrick-Lake, B., Morton, S., Pencina, M., et al. (2020). The predictive approaches to treatment effect heterogeneity (PATH) statement. Annals of Internal Medicine, 172:35–45.
  • la Roi-Teeuw et al., (2024) la Roi-Teeuw, H. M., van Royen, F. S., de Hond, A., Zahra, A., de Vries, S., Bartels, R., Carriero, A. J., van Doorn, S., Dunias, Z. S., Kant, I., et al. (2024). Don’t be misled: Three misconceptions about external validation of clinical prediction models. Journal of Clinical Epidemiology, page 111387.
  • Prosperi et al., (2020) Prosperi, M., Guo, Y., Sperrin, M., Koopman, J. S., Min, J. S., He, X., Rich, S., Wang, M., Buchan, I. E., and Bian, J. (2020). Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nature Machine Intelligence, 2(7):369–375.
  • Riley et al., (2024) Riley, R. D., Archer, L., Snell, K. I., Ensor, J., Dhiman, P., Martin, G. P., Bonnett, L. J., and Collins, G. S. (2024). Evaluation of clinical prediction models (part 2): how to undertake an external validation study. BMJ, 384.
  • Riley et al., (2019) Riley, R. D., Snell, K. I., Moons, K. G., and Debray, T. P. (2019). Fundamental statistical methods for prognosis research. In Prognosis Research in Health Care, chapter 3, pages 37–68. Oxford University Press.
  • Robertson et al., (2021) Robertson, S. E., Leith, A., Schmid, C. H., and Dahabreh, I. J. (2021). Assessing heterogeneity of treatment effects in observational studies. American Journal of Epidemiology, 190(6):1088–1100.
  • Rosenbaum and Rubin, (1983) Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55.
  • Rubin, (1980) Rubin, D. B. (1980). Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association, 75(371):591–593.
  • Sadatsafavi et al., (2020) Sadatsafavi, M., Mansournia, M. A., and Gustafson, P. (2020). A threshold-free summary index for quantifying the capacity of covariates to yield efficient treatment rules. Statistics in Medicine, 39:1362–1373.
  • Steyerberg, (2019) Steyerberg, E. W. (2019). Clinical Prediction Models. Springer International Publishing.
  • Van Calster et al., (2019) Van Calster, B., McLernon, D. J., Van Smeden, M., Wynants, L., and Steyerberg, E. W. (2019). Calibration: the Achilles heel of predictive analytics. BMC Medicine, 17(1):230.
  • Van Calster et al., (2016) Van Calster, B., Nieboer, D., Vergouwe, Y., De Cock, B., Pencina, M. J., and Steyerberg, E. W. (2016). A calibration hierarchy for risk models was defined: from utopia to empirical data. Journal of Clinical Epidemiology, 74:167–176.
  • van Klaveren et al., (2019) van Klaveren, D., Balan, T. A., Steyerberg, E. W., and Kent, D. M. (2019). Models with interactions overestimated heterogeneity of treatment effects and were prone to treatment mistargeting. Journal of Clinical Epidemiology, 114:72–83.
  • van Klaveren et al., (2018) van Klaveren, D., Steyerberg, E. W., Serruys, P. W., and Kent, D. M. (2018). The proposed ‘concordance-statistic for benefit’ provided a useful metric when modeling heterogeneous treatment effects. Journal of Clinical Epidemiology, 94:59–68.
  • Veitch and Zaveri, (2020) Veitch, V. and Zaveri, A. (2020). Sense and sensitivity analysis: Simple post-hoc analysis of bias due to unobserved confounding. Advances in Neural Information Processing Systems, 33:10999–11009.
  • Vickers et al., (2007) Vickers, A. J., Kattan, M. W., and Sargent, D. J. (2007). Method for evaluating prediction models that apply the results of randomized trials to individual patients. Trials, 8:1–11.
  • Xia et al., (2023) Xia, Y., Gustafson, P., and Sadatsafavi, M. (2023). Methodological concerns about “concordance-statistic for benefit” as a measure of discrimination in predicting treatment benefit. Diagnostic and Prognostic Research, 7(1):10.
  • Yitzhaki and Olkin, (1991) Yitzhaki, S. and Olkin, I. (1991). Concentration indices and concentration curves. Lecture Notes-Monograph Series, pages 380–392.
  • Zhou and Zhu, (2021) Zhou, N. and Zhu, L. (2021). On IPW-based estimation of conditional average treatment effects. Journal of Statistical Planning and Inference, 215:1–22.

Appendix A: Extra Results in population 1

Confounding Bias Function

The first population is generated using simple linear functions, yet the confounding bias $\text{bias}(X)$, which is determined by 18 parameters (coefficients) and the covariate $X$, is complex. With $X$ fixed at a specific value $x$, we depict the bias as a function of the coefficients and observe how it changes as these coefficients vary. We set $\alpha_{03} = -0.058$ and allow $\alpha_{13} - \alpha_{03}$ to take any value within the range $(0.0557, 0.4194)$. This interval is determined by $\alpha_{13} \in (-0.0011, 0.3614)$ for a valid distribution.

Figure 4: Absolute $\text{bias}(X)$ values under various parameter scenarios. Within the blue, green, and red regions, absolute bias values exceed $0.01$, $0.05$, and $0.1$, respectively. The red dot represents the scenario with $\beta_1 = 0.7621$ and $\alpha_{13} - \alpha_{03} = 0.3717$, while the green dot corresponds to $\beta_1 = 0$ and $\alpha_{13} - \alpha_{03} = 0.3717$.

Figure 4 shows $\text{bias}(X)$ across varying strengths of confounding, revealing a complex relationship between the bias and the three parameters $(\alpha_{03}, \alpha_{13}, \beta_1)$. When $\beta_1 = 0.7621$ and $\alpha_{13} - \alpha_{03} = 0.3717$, we have $\text{bias}(X_1 = 1, X_2 = 1) = 0.0640$, $\text{bias}(X_1 = 1, X_2 = 0) = 0.1298$, $\text{bias}(X_1 = 0, X_2 = 1) = 0.0581$, and $\text{bias}(X_1 = 0, X_2 = 0) = 0.1090$ (with all values rounded to four decimal places).

Selected Coefficients and Performance for TBPs

In population 1, we define a distribution $\mathbb{P}$ and a linear function $\tau(X,Z)$ with a total of 18 parameters. To design a moderately calibrated $h_2(x_1, x_2)$ and a strongly calibrated $h_3(x_1, x_2)$, we specify

\begin{align*}
b_0 &= \left[(\alpha_{10}-\alpha_{00}) + (\alpha_{13}-\alpha_{03})\frac{p_{001}}{p_{001}+p_{000}}\right], \\
b_1 &= \Bigg[(\alpha_{11}-\alpha_{01})\frac{p_{101}+p_{100}}{p_{101}+p_{100}+p_{011}+p_{010}} + (\alpha_{12}-\alpha_{02})\frac{p_{011}+p_{010}}{p_{101}+p_{100}+p_{011}+p_{010}} \\
&\qquad + (\alpha_{13}-\alpha_{03})\left(\frac{p_{101}+p_{011}}{p_{101}+p_{100}+p_{011}+p_{010}} - \frac{p_{001}}{p_{001}+p_{000}}\right)\Bigg], \\
b_2 &= \Bigg[(\alpha_{11}-\alpha_{01})\left(1 - 2\frac{p_{101}+p_{100}}{p_{101}+p_{100}+p_{011}+p_{010}}\right) \\
&\qquad + (\alpha_{12}-\alpha_{02})\left(1 - 2\frac{p_{011}+p_{010}}{p_{101}+p_{100}+p_{011}+p_{010}}\right) \\
&\qquad + (\alpha_{13}-\alpha_{03})\left(\frac{p_{111}}{p_{111}+p_{110}} - 2\frac{p_{101}+p_{011}}{p_{101}+p_{100}+p_{011}+p_{010}} + \frac{p_{001}}{p_{001}+p_{000}}\right)\Bigg],
\end{align*}

and

\begin{align*}
c_0 &= \left[(\alpha_{10}-\alpha_{00}) + (\alpha_{13}-\alpha_{03})\frac{p_{001}}{p_{001}+p_{000}}\right], \\
c_1 &= \left[(\alpha_{11}-\alpha_{01}) + (\alpha_{13}-\alpha_{03})\left(\frac{p_{101}}{p_{101}+p_{100}} - \frac{p_{001}}{p_{001}+p_{000}}\right)\right], \\
c_2 &= \left[(\alpha_{12}-\alpha_{02}) + (\alpha_{13}-\alpha_{03})\left(\frac{p_{011}}{p_{011}+p_{010}} - \frac{p_{001}}{p_{001}+p_{000}}\right)\right], \\
c_3 &= \left[(\alpha_{13}-\alpha_{03})\left(\frac{p_{111}}{p_{111}+p_{110}} - \frac{p_{101}}{p_{101}+p_{100}} - \frac{p_{011}}{p_{011}+p_{010}} + \frac{p_{001}}{p_{001}+p_{000}}\right)\right].
\end{align*}

When $\beta_1 = 0$, the variable $Z$ in population 1 ceases to be a confounding variable, and $A \perp\!\!\!\perp X, Z$. Consequently, $\text{bias}(X) = 0$, resulting in zero bias for both $\operatorname{E}[B \mid H]$ and $C_b$.
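The role of $\beta_1$ can be illustrated with a much smaller toy population than population 1. The sketch below rests on several assumptions that are not part of this study: a single binary confounder $Z$, no covariate $X$, a linear mean outcome model, and hypothetical names and values (`a0` to `a3`, `beta0`, `beta1`, `p_z`). It compares the unadjusted contrast $\operatorname{E}[Y \mid A=1] - \operatorname{E}[Y \mid A=0]$ with the true average benefit, both computed exactly by enumeration over $Z$.

```python
import math


def expit(v):
    return 1.0 / (1.0 + math.exp(-v))


def naive_and_true_benefit(beta1, beta0=-0.5, p_z=0.4,
                           a0=0.1, a1=0.2, a2=0.3, a3=0.25):
    """Toy population (hypothetical, not population 1).

    Outcome model:   E[Y | A=a, Z=z] = a0 + a1*a + a2*z + a3*a*z
    Treatment model: P(A=1 | Z=z)    = expit(beta0 + beta1*z)
    """
    def mean_y(a, z):
        return a0 + a1 * a + a2 * z + a3 * a * z

    p_of_z = {0: 1.0 - p_z, 1: p_z}
    p_treated = {z: expit(beta0 + beta1 * z) for z in (0, 1)}

    def mean_y_given_a(a):
        # Weights proportional to P(Z=z, A=a); normalising gives P(Z=z | A=a)
        w = {z: p_of_z[z] * (p_treated[z] if a == 1 else 1.0 - p_treated[z])
             for z in (0, 1)}
        return sum(mean_y(a, z) * w[z] for z in (0, 1)) / (w[0] + w[1])

    naive = mean_y_given_a(1) - mean_y_given_a(0)   # ignores Z
    true = sum((mean_y(1, z) - mean_y(0, z)) * p_of_z[z] for z in (0, 1))
    return naive, true


for b1 in (0.0, 0.5, 1.0):
    naive, true = naive_and_true_benefit(b1)
    print(f"beta1 = {b1}: naive = {naive:.4f}, true = {true:.4f}, "
          f"bias = {naive - true:.4f}")
```

In this toy setting the bias vanishes when `beta1` is zero, because treatment assignment no longer depends on $Z$, and it is positive for the positive values of `beta1` shown, given the positive coefficients chosen here.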

Figure 5: Calibration plots and RCCs for $h_1(X_1, X_2)$, $h_2(X_1, X_2)$, and $h_3(X_1, X_2)$ when $\beta_1 = 0$. The blue dotted curves refer to $\operatorname{E}[B \mid H]$ and $C_b$, and the red dashed curves refer to $\tilde{\operatorname{E}}[B \mid H]$ and $\tilde{C}_b$.

Appendix B: Propositions for $C_b$ Calculation

The expectation of Maximum-like (continuous). For continuous variables $B$ and $H$, let $\{(B_1, H_1), (B_2, H_2)\}$ denote two independent copies of $(B, H)$. The expectation of Maximum-like satisfies

\[
\operatorname{E}[B_1 I(H_1 \geq H_2) + B_2 I(H_1 < H_2)] = 2\operatorname{E}[B F_H(H)],
\]

where $F_H$ is the CDF of $H$.

Proof. We show that the expected value of the Maximum-like quantity for a pair of patients is twice the expected value of $B$ weighted by the CDF of $H$ evaluated at $H$.

\begin{align*}
&\operatorname{E}[B_1 I(H_1 \geq H_2) + B_2 I(H_1 < H_2)] \\
&= 2\operatorname{E}[B_1 I(H_1 \geq H_2)] = 2\operatorname{E}_{H_1,H_2}\big[\operatorname{E}[B_1 I(H_1 \geq H_2) \mid H_1, H_2]\big] \\
&= 2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} b_1 I(h_1 \geq h_2) f_{B_1 \mid H_1,H_2}(b_1 \mid h_1, h_2)\,db_1\right) f_{H_2}(h_2)\,dh_2\, f_{H_1}(h_1)\,dh_1 \\
&= 2\int_{-\infty}^{\infty} \operatorname{E}[B_1 \mid H_1 = h_1]\left(\int_{-\infty}^{h_1} f_{H_2}(h_2)\,dh_2\right) f_{H_1}(h_1)\,dh_1 \\
&= 2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} b_1 F_{H_1}(h_1) f_{B_1,H_1}(b_1, h_1)\,db_1\,dh_1 \\
&= 2\operatorname{E}[B F_H(H)].
\end{align*}
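The identity above can be checked numerically. The following Monte Carlo sketch uses a hypothetical population that is not one of the synthetic populations in this study: $H \sim \text{Uniform}(0,1)$ and $B = H + \varepsilon$ with independent Gaussian noise, for which both sides equal $2/3$ in theory. The empirical CDF stands in for $F_H$, and all variable names are illustrative.

```python
import random
from bisect import bisect_right

random.seed(0)
n = 100_000  # even, so observations can be split into disjoint pairs

# Hypothetical population: H ~ Uniform(0, 1), B = H + Gaussian noise
H = [random.random() for _ in range(n)]
B = [h + random.gauss(0.0, 0.1) for h in H]

# Left-hand side: average of B1*I(H1 >= H2) + B2*I(H1 < H2) over disjoint
# pairs (i, i+1), which act as independent copies of (B, H)
pair_terms = [B[i] if H[i] >= H[i + 1] else B[i + 1] for i in range(0, n, 2)]
lhs = sum(pair_terms) / len(pair_terms)

# Right-hand side: 2 * E[B * F_H(H)], with F_H replaced by the empirical CDF
sorted_h = sorted(H)
rhs = 2.0 * sum(b * bisect_right(sorted_h, h) / n for b, h in zip(B, H)) / n

print(f"pairwise average: {lhs:.4f}")
print(f"2 * E[B F_H(H)]:  {rhs:.4f}")
```

The two estimates should agree up to Monte Carlo error; the second form needs only a single pass over the sample plus a sort, rather than an average over pairs.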

The Gini-like index (continuous). For continuous variables $B$ and $H$, the Gini-like index, representing twice the area ($A$) between the line of independence ($p$) and the relative concentration curve ($R(p)$), is given by

\[
2A = \frac{2\operatorname{E}[B F_H(H)] - \operatorname{E}[B]}{\operatorname{E}[B]}.
\]

We assume that $\operatorname{E}[B] > 0$.
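As a quick sanity check of this expression, consider a hypothetical population (not one of the synthetic populations in this study) with $H \sim \text{Uniform}(0,1)$ and $\operatorname{E}[B \mid H = h] = h$, so that $F_H(h) = h$ on $[0,1]$. Under these assumptions the formula and a direct computation of the area give the same value:

```latex
\begin{align*}
\operatorname{E}[B] &= \int_0^1 h\,dh = \tfrac{1}{2}, \qquad
\operatorname{E}[B F_H(H)] = \int_0^1 h \cdot h\,dh = \tfrac{1}{3}, \\
2A &= \frac{2 \cdot \tfrac{1}{3} - \tfrac{1}{2}}{\tfrac{1}{2}} = \tfrac{1}{3}, \\
R(p) &= \frac{\int_0^p h\,dh}{\operatorname{E}[B]} = p^2, \qquad
2\int_0^1 \big(p - R(p)\big)\,dp = 2\left(\tfrac{1}{2} - \tfrac{1}{3}\right) = \tfrac{1}{3}.
\end{align*}
```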

Proof. Note that $R(p) = \frac{\operatorname{E}[B I(H \leq h_p)]}{\operatorname{E}[B]}$, where $h_p$ denotes the $p$-th quantile of $H$. To find twice the area between the line of independence $p$ and the RCC $R(p)$, we start with:

\[
\begin{aligned}
2A\operatorname{E}[B] &= 2\int^{1}_{0}\left(p\operatorname{E}[B]-\int^{h_{p}}_{-\infty}\operatorname{E}[B\mid H=h]f_{H}(h)\,dh\right)dp\\
&= \operatorname{E}[B]-2\int^{1}_{0}\int^{h_{p}}_{-\infty}\operatorname{E}[B\mid H=h]f_{H}(h)\,dh\,dp\\
&= \operatorname{E}[B]-2\int^{\infty}_{-\infty}\operatorname{E}[B\mid H=h]f_{H}(h)(1-F_{H}(h))\,dh\\
&= 2\int^{\infty}_{-\infty}\operatorname{E}[B\mid H=h]f_{H}(h)F_{H}(h)\,dh-\operatorname{E}[B]\\
&= 2\operatorname{E}[BF_{H}(H)]-\operatorname{E}[B],
\end{aligned}
\]

where $h_{p}$ represents the value of $h$ at the $p$-th quantile. Therefore, we have

\[
2A=\frac{2\operatorname{E}[BF_{H}(H)]-\operatorname{E}[B]}{\operatorname{E}[B]}.
\]
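The identity just proved can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the proof: $H$ is taken to be Uniform(0, 1), so that $F_{H}(h)=h$ and $h_{p}=p$, and the conditional mean $\operatorname{E}[B\mid H=h]$ is supplied as a hypothetical function `m`. It computes the Gini-like index once from the closed form and once as twice the area between the line of independence and the RCC.

```python
def gini_like_two_ways(m, n=2000):
    """Return (closed_form, direct_area) for the Gini-like index.

    Illustrative assumptions: H ~ Uniform(0, 1), so F_H(h) = h and h_p = p;
    m(h) plays the role of E[B | H = h].
    """
    dh = 1.0 / n
    mids = [(i + 0.5) * dh for i in range(n)]  # midpoint rule on [0, 1]

    mean_b = sum(m(h) for h in mids) * dh      # E[B]
    e_bf = sum(m(h) * h for h in mids) * dh    # E[B F_H(H)]
    closed_form = (2.0 * e_bf - mean_b) / mean_b

    # Direct route: 2 * integral over p of (p - R(p)), where R(p) is
    # accumulated cell by cell and evaluated at each cell midpoint.
    area, cum = 0.0, 0.0
    for h in mids:
        cell = m(h) * dh
        r_mid = (cum + 0.5 * cell) / mean_b
        area += (h - r_mid) * dh
        cum += cell
    return closed_form, 2.0 * area


# Example: E[B | H = h] = h, so the benefit concentrates at high values of H.
closed, direct = gini_like_two_ways(lambda h: h)
```

For this example both routes should agree with each other up to discretization error; other choices of `m` can be substituted as long as $\operatorname{E}[B]>0$.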

The expectation of Maximum-like (discrete). For discrete variables $B$ and $H$, we have two independent copies denoted as $\{(B_{1},H_{1}),(B_{2},H_{2})\}$. The expectation of Maximum-like satisfies

\[
\operatorname{E}[B_{1}I(H_{1}\geq H_{2})+B_{2}I(H_{1}<H_{2})]=2\operatorname{E}[BF_{H}(H)]-\operatorname{E}[Bf_{H}(H)],
\]

where $f_{H}$ denotes the probability mass function (PMF) of $H$.

Proof.

\[
\begin{aligned}
&\operatorname{E}[B_{1}I(H_{1}\geq H_{2})+B_{2}I(H_{1}<H_{2})]\\
&= 2\operatorname{E}\left[\operatorname{E}[B_{1}I(H_{1}>H_{2})\mid H_{1},H_{2}]\right]+\operatorname{E}\left[\operatorname{E}[B_{1}I(H_{1}=H_{2})\mid H_{1},H_{2}]\right]\\
&= 2\sum_{h_{1}}\left[\sum_{b_{1}}b_{1}\operatorname{P}(B_{1}=b_{1}\mid H_{1}=h_{1})\right]\left(\sum_{h_{2}}I(h_{2}<h_{1})\operatorname{P}(H_{2}=h_{2})\right)\operatorname{P}(H_{1}=h_{1})\\
&\quad+\sum_{h_{1}}\left[\sum_{b_{1}}b_{1}\operatorname{P}(B_{1}=b_{1}\mid H_{1}=h_{1})\right]\left(\sum_{h_{2}}I(h_{2}=h_{1})\operatorname{P}(H_{2}=h_{2})\right)\operatorname{P}(H_{1}=h_{1})\\
&= 2\sum_{h_{1}}\sum_{b_{1}}b_{1}\operatorname{P}(H_{1}<h_{1})\operatorname{P}(B_{1}=b_{1},H_{1}=h_{1})+\sum_{h_{1}}\sum_{b_{1}}b_{1}\operatorname{P}(H_{1}=h_{1})\operatorname{P}(B_{1}=b_{1},H_{1}=h_{1})\\
&= 2\operatorname{E}[BF_{H}(H)]-\operatorname{E}[Bf_{H}(H)].
\end{aligned}
\]
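Because every quantity in the discrete identity is a finite sum, it can be verified exactly on a small example. The following sketch uses a made-up joint PMF of $(B,H)$ (the values in `pmf` and all function names are hypothetical, chosen only for illustration) and compares the pairwise expectation on the left-hand side with the closed form on the right-hand side using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint PMF of (B, H); keys are (b, h) pairs, values sum to 1.
pmf = {
    (0, 1): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 6),
    (3, 2): Fraction(1, 6),
    (5, 3): Fraction(1, 6),
}


def marginal_and_cdf(pmf):
    """Return the PMF f_H and the CDF F_H of H as dictionaries keyed by h."""
    f_h = {}
    for (_, h), p in pmf.items():
        f_h[h] = f_h.get(h, Fraction(0)) + p
    cdf_h, running = {}, Fraction(0)
    for h in sorted(f_h):
        running += f_h[h]
        cdf_h[h] = running
    return f_h, cdf_h


def lhs_pairwise(pmf):
    """E[B1 I(H1 >= H2) + B2 I(H1 < H2)] over two independent copies."""
    total = Fraction(0)
    for ((b1, h1), p1), ((b2, h2), p2) in product(pmf.items(), repeat=2):
        total += p1 * p2 * (b1 if h1 >= h2 else b2)
    return total


def rhs_closed_form(pmf):
    """2 E[B F_H(H)] - E[B f_H(H)]."""
    f_h, cdf_h = marginal_and_cdf(pmf)
    e_bf = sum(p * b * cdf_h[h] for (b, h), p in pmf.items())
    e_bpmf = sum(p * b * f_h[h] for (b, h), p in pmf.items())
    return 2 * e_bf - e_bpmf


lhs = lhs_pairwise(pmf)
rhs = rhs_closed_form(pmf)
```

The two values `lhs` and `rhs` coincide exactly for this PMF; the term $\operatorname{E}[Bf_{H}(H)]$ is what accounts for ties $H_{1}=H_{2}$, which have positive probability in the discrete case.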

The Gini-like index (discrete). For discrete variables $B$ and $H$, the Gini-like index, representing twice the area ($A$) between the line of independence ($p$) and the relative concentration curve ($R(p)$), is defined as

\[
2A=\frac{2\operatorname{E}[BF_{H}(H)]-\operatorname{E}[Bf_{H}(H)]-\operatorname{E}[B]}{\operatorname{E}[B]}.
\]

We assume that $\operatorname{E}[B]>0$.

Proof. For a discrete variable $H$ with $k$ distinct values, patients are ranked by their value of $H$ in ascending order to plot the RCC. Assume that $h_{1}<h_{2}<h_{3}<\cdots<h_{k}$ with corresponding probabilities $p_{1},p_{2},p_{3},\cdots,p_{k}$. Note that $\sum_{i=1}^{k}p_{i}=1$, and we can express $2\operatorname{E}[BF_{H}(H)]$ as

\[
2\operatorname{E}[BF_{H}(H)]=2\left(p_{1}\operatorname{E}[BI(H=h_{1})]+(p_{1}+p_{2})\operatorname{E}[BI(H=h_{2})]+\cdots+\operatorname{E}[BI(H=h_{k})]\right).
\]

If $B\geq 0$, the area $A$ is bounded between $0$ and $0.5$. We calculate the area $A$ as $0.5$ minus the sum of the areas of one triangle and $(k-1)$ trapezoids, which gives

\[
2A=\frac{1}{\operatorname{E}[B]}\left((1-p_{k})\operatorname{E}[B]-(p_{1}+p_{2})\operatorname{E}[BI(H\leq h_{1})]-\cdots-(p_{k-1}+p_{k})\operatorname{E}[BI(H\leq h_{k-1})]\right).
\]

We then express each $\operatorname{E}[BI(H\leq h_{i})]$ as the sum of the terms $\operatorname{E}[BI(H=h_{j})]$, $j\leq i$, which correspond to disjoint groups of patients. For instance,

\[
\operatorname{E}[BI(H\leq h_{3})]=\operatorname{E}[BI(H=h_{1})]+\operatorname{E}[BI(H=h_{2})]+\operatorname{E}[BI(H=h_{3})].
\]

It is possible to show that for $k<\infty$

\[
\begin{aligned}
\frac{2\operatorname{E}[BF_{H}(H)]-\operatorname{E}[B]}{\operatorname{E}[B]}-2A &= \frac{\sum_{i=1}^{k}p_{i}\operatorname{E}[BI(H=h_{i})]}{\operatorname{E}[B]}\\
&= \frac{\operatorname{E}[Bf_{H}(H)]}{\operatorname{E}[B]}.
\end{aligned}
\]

If some patients have $B<0$, the area $A$ can take a value greater than $0.5$, but the expression for $A$ stays the same. In other words, the Gini-like index could be greater than $1$ with some negative treatment benefit values in the target population.
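The geometric construction in this proof (one triangle plus $(k-1)$ trapezoids under the piecewise-linear RCC) can be compared against the closed-form expression on small examples. The sketch below is a hedged illustration: both PMFs and all function names are hypothetical, and the second PMF includes a negative benefit value to show a case where the index exceeds $1$.

```python
from fractions import Fraction


def gini_like_geometric(pmf):
    """2A computed as 1 minus twice the area under the piecewise-linear RCC."""
    mean_b = sum(p * b for (b, _), p in pmf.items())
    levels = sorted({h for (_, h) in pmf})
    area, r_prev = Fraction(0), Fraction(0)
    for h_i in levels:
        p_i = sum(p for (_, h), p in pmf.items() if h == h_i)
        e_i = sum(p * b for (b, h), p in pmf.items() if h == h_i)
        r_next = r_prev + e_i / mean_b
        # First piece is a triangle (r_prev = 0), the rest are trapezoids.
        area += p_i * (r_prev + r_next) / 2
        r_prev = r_next
    return 1 - 2 * area


def gini_like_formula(pmf):
    """(2 E[B F_H(H)] - E[B f_H(H)] - E[B]) / E[B]."""
    mean_b = sum(p * b for (b, _), p in pmf.items())
    levels = {h for (_, h) in pmf}
    f_h = {v: sum(p for (_, h), p in pmf.items() if h == v) for v in levels}
    cdf_h = {v: sum(p for (_, h), p in pmf.items() if h <= v) for v in levels}
    e_bf = sum(p * b * cdf_h[h] for (b, h), p in pmf.items())
    e_bpmf = sum(p * b * f_h[h] for (b, h), p in pmf.items())
    return (2 * e_bf - e_bpmf - mean_b) / mean_b


# Hypothetical PMFs of (B, H): one with B >= 0, one with a negative benefit.
pmf_pos = {
    (0, 1): Fraction(1, 4), (2, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 6), (3, 2): Fraction(1, 6),
    (5, 3): Fraction(1, 6),
}
pmf_neg = {(-2, 1): Fraction(1, 2), (4, 2): Fraction(1, 2)}

index_pos = gini_like_formula(pmf_pos)
index_neg = gini_like_formula(pmf_neg)
```

In both examples the geometric and closed-form values agree exactly, and `index_neg` is larger than $1$ because the RCC dips below zero before rising to $1$.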

Aside: For the univariate case, the properties of the expectation of maximum follow similar patterns. For a continuous variable $X$ ($X\geq 0$) with two independent copies $\{X_{1},X_{2}\}$, we observe that

\[
\operatorname{E}[\max(X_{1},X_{2})]=2\operatorname{E}[XF_{X}(X)],
\]

where $F_{X}$ represents the CDF of $X$. Similarly, for a discrete variable $X$ with two independent copies $\{X_{1},X_{2}\}$, the equation becomes

\[
\operatorname{E}[\max(X_{1},X_{2})]=2\operatorname{E}[XF_{X}(X)]-\operatorname{E}[Xf_{X}(X)],
\]

where $f_{X}$ denotes the PMF of $X$. Moreover, for the Gini coefficient ($2A^{\prime}$), we have

\[
\frac{2\operatorname{E}[XF_{X}(X)]-\operatorname{E}[X]}{\operatorname{E}[X]}-2A^{\prime}=\frac{\operatorname{E}[Xf_{X}(X)]}{\operatorname{E}[X]}.
\]

Appendix C: Closed-form Expressions in Setting 2

With Control for Confounding

Recall that $H:=h(X)$. As $\tau$ is not a one-to-one map in population 2, we begin by calculating the pre-image of the point $(\tau_{s}(x),h)$. The pre-image of any point $(\tau_{s}(x),h)$ consists of two points given by the set

\[
\left\{(x_{1}=\tau_{s}(x),\,x_{2}=h-\tau_{s}(x)),\ (x_{1}=h-\tau_{s}(x),\,x_{2}=\tau_{s}(x))\right\},
\]

which exists when $2\tau_{s}(x)\geq h$ and $h\geq\tau_{s}(x)$.

Observe that the pre-image of any point $(\tau_{s}(x),h)$ comprises two points obtained by interchanging the values of $x_{1}$ and $x_{2}$. These two points correspond to two scenarios: $x_{1}\geq x_{2}$ and $x_{1}<x_{2}$. Fortunately, in each scenario, the max operator functions as a one-to-one map. Overall, the joint density of $(\tau_{s}(X),H)$ is the sum of the two parts, which is

\[
f_{\tau_{s}(X),H}(\tau_{s}(x),h)=\begin{cases}2,&\tau_{s}(x)\leq h,\ h\leq 2\tau_{s}(x),\ 0\leq\tau_{s}(x)\leq 1,\\ 0,&\text{otherwise}.\end{cases}
\]

Now, we can recalculate the marginal PDFs from the joint density $f_{\tau_{s}(X),H}(\tau_{s}(x),h)$:

\[
\begin{aligned}
f_{\tau_{s}(X)}(\tau_{s}(x)) &= \begin{cases}2\tau_{s}(x),&0\leq\tau_{s}(x)\leq 1,\\ 0,&\text{otherwise},\end{cases}\\
f_{H}(h) &= \begin{cases}h,&0\leq h<1,\\ 2-h,&1\leq h\leq 2,\\ 0,&\text{otherwise}.\end{cases}
\end{aligned}
\]

The CDF of $H$ is

\[
F_{H}(h)=\begin{cases}0,&h<0,\\ h^{2}/2,&0\leq h<1,\\ 1-(2-h)^{2}/2,&1\leq h<2,\\ 1,&2\leq h.\end{cases}
\]

With the joint PDF of $(\tau_{s}(X),H)$, we calculate the conditional density function of $(\tau_{s}(X)\mid H)$ by definition:

\[
f_{\tau_{s}(X)\mid H}(\tau_{s}(x)\mid h)=\begin{cases}2/h,&\tau_{s}(x)\leq h,\ h\leq 2\tau_{s}(x),\ 0\leq\tau_{s}(x)\leq 1,\ 0<h\leq 1,\\ 2/(2-h),&\tau_{s}(x)\leq h,\ h\leq 2\tau_{s}(x),\ 0\leq\tau_{s}(x)\leq 1,\ 1<h<2,\\ 0,&\text{otherwise}.\end{cases}
\]

Finally, we can compute $\operatorname{E}[\tau_{s}(X)\mid H=h]$ by the definition of conditional expectation:

\[
\begin{aligned}
\operatorname{E}[\tau_{s}(X)\mid H=h] &= \int^{\infty}_{-\infty}\tau_{s}(x)f_{\tau_{s}(X)\mid H}(\tau_{s}(x)\mid h)\,d\tau_{s}(x)\\
&= \begin{cases}\frac{3h}{4},&0<h\leq 1,\\ \frac{1-h^{2}/4}{2-h},&1<h<2,\\ 0,&\text{otherwise}.\end{cases}
\end{aligned}
\]
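The closed form above can be sanity-checked numerically. The sketch below rests on an assumption that is inferred from the densities in this appendix rather than restated here: $X_{1}$ and $X_{2}$ are independent Uniform(0, 1), $\tau_{s}(x)=\max(x_{1},x_{2})$, and $h(x)=x_{1}+x_{2}$. Under that assumption, given $H=h$ the coordinate $X_{1}$ is uniform on $[\max(0,h-1),\min(1,h)]$, so the conditional mean of $\max(x_{1},h-x_{1})$ can be integrated directly. The function names are hypothetical.

```python
def cond_mean_closed_form(h):
    """E[tau_s(X) | H = h] as given by the piecewise expression above."""
    if 0 < h <= 1:
        return 3 * h / 4
    if 1 < h < 2:
        return (1 - h * h / 4) / (2 - h)
    return 0.0


def cond_mean_numeric(h, n=20000):
    """Midpoint-rule average of max(x1, h - x1) over the admissible x1 range.

    Assumes X1, X2 iid Uniform(0, 1), tau_s = max(x1, x2), H = x1 + x2.
    """
    lo, hi = max(0.0, h - 1.0), min(1.0, h)
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x1 = lo + (i + 0.5) * dx
        total += max(x1, h - x1)
    return total / n


# Largest disagreement over a few values of h in (0, 2).
max_gap = max(
    abs(cond_mean_closed_form(h) - cond_mean_numeric(h))
    for h in (0.3, 1.0, 1.5, 1.9)
)
```

If the assumed setting matches Setting 2, `max_gap` should be at the level of floating-point and discretization error.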

Then, we calculate the $C_{b}$ for $H$ by computing

\[
\begin{aligned}
\operatorname{E}[\tau_{s}(X)F_{H}(H)] &= \int_{h}\int_{\tau_{s}(x)}\tau_{s}(x)F_{H}(h)f_{\tau_{s}(X),H}(\tau_{s}(x),h)\,d\tau_{s}(x)\,dh\\
&= \int^{1}_{0}\int^{h}_{\frac{h}{2}}2\tau_{s}(x)\frac{h^{2}}{2}\,d\tau_{s}(x)\,dh+\int^{2}_{1}\int^{1}_{\frac{h}{2}}2\tau_{s}(x)\left(1-\frac{(2-h)^{2}}{2}\right)d\tau_{s}(x)\,dh\\
&= 2\left(\frac{3}{80}+\frac{19}{120}\right)=\operatorname{E}[BF_{H}(H)].\\
C_{b,h} &= 1-\frac{\tau^{*}}{2\operatorname{E}[BF_{H}(H)]}=1-\frac{2/3}{4\left(\frac{3}{80}+\frac{19}{120}\right)}=0.1489362.
\end{aligned}
\]

Furthermore, we can calculate the $C_b$ for $\tau_s(X)$ by computing

$$
\begin{aligned}
\operatorname{E}[\tau_s(X) F_{\tau_s(X)}(\tau_s(X))] &= \int_h \int_{\tau_s(x)} \tau_s(x)\, F_{\tau_s(X)}(\tau_s(x))\, f_{\tau_s(X),H}(\tau_s(x), h)\, d\tau_s(x)\, dh \\
&= \int_0^1 \int_{\tau_s(x)}^{2\tau_s(x)} 2\tau_s(x)^3\, dh\, d\tau_s(x) \\
&= \frac{2}{5} = \operatorname{E}[B F_{\tau_s(X)}(\tau_s(X))]. \\
C_{b,\tau_s(x)} &= 1 - \frac{\tau^*}{2\operatorname{E}[B F_{\tau_s(X)}(\tau_s(X))]} = 1 - \frac{2/3}{4/5} = 0.166666667.
\end{aligned}
$$
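The two closed-form values above can be cross-checked numerically. The following is a minimal sketch rather than the authors' code: it assumes $X_1$ and $X_2$ are independent Uniform(0, 1), as the densities used in this example suggest, and the helper names (`F_H`, `F_tau`, `midpoint_mean`) are hypothetical. It approximates each expectation with a midpoint rule on the unit square.

```python
# Assumption: X1, X2 independent Uniform(0, 1); tau_s(X) = max(X1, X2); H = X1 + X2.
# All function names here are illustrative, not taken from the paper.

def F_H(h):
    """CDF of H = X1 + X2 (triangular distribution on [0, 2])."""
    return h * h / 2 if h <= 1 else 1 - (2 - h) ** 2 / 2


def F_tau(t):
    """CDF of tau_s(X) = max(X1, X2) on [0, 1]."""
    return t * t


def midpoint_mean(g, n=400):
    """Midpoint-rule approximation of E[g(X1, X2)] over the unit square."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * step
        for j in range(n):
            x2 = (j + 0.5) * step
            total += g(x1, x2)
    return total * step * step


# tau* = E[tau_s(X)], closed form 2/3
tau_star = midpoint_mean(lambda x1, x2: max(x1, x2))
# E[tau_s(X) F_H(H)], closed form 2 * (3/80 + 19/120) = 47/120
e_tau_FH = midpoint_mean(lambda x1, x2: max(x1, x2) * F_H(x1 + x2))
# E[tau_s(X) F_tau(tau_s(X))], closed form 2/5
e_tau_Ftau = midpoint_mean(lambda x1, x2: max(x1, x2) * F_tau(max(x1, x2)))

cb_h = 1 - tau_star / (2 * e_tau_FH)      # C_b when ranking by H
cb_tau = 1 - tau_star / (2 * e_tau_Ftau)  # C_b when ranking by tau_s(X)

print(f"C_b for H:        {cb_h:.6f}")
print(f"C_b for tau_s(X): {cb_tau:.6f}")
```

A deterministic grid is used instead of simulation so that the result does not depend on a random seed; the grid size `n` trades accuracy for run time.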

Without Control for Confounding

The confounding bias is defined as

$$
\text{bias}(X) = \left(\operatorname{E}[Y \mid A=1, X] - \operatorname{E}[Y \mid A=0, X]\right) - \tau_s(X).
$$

We derive an expression for $\operatorname{E}[Y \mid A=1, X] - \operatorname{E}[Y \mid A=0, X]$ from $\mu_a(x,z)$, $a \in \{0,1\}$, by integrating $Z$ out of $\mu_a(x,z)$:

$$
\begin{aligned}
\operatorname{E}[Y \mid A=1, X=x] &= \int_z \operatorname{E}[Y \mid A=1, X=x, Z=z]\, f_{Z \mid X,A}(z \mid x, 1)\, dz \\
&= \int_z 2z \left(\frac{1}{2}\max(x_1,x_2) + \max(x_2,z) + \frac{1}{10}x_1\right) dz \\
&= \frac{1}{2}\max(x_1,x_2) + \frac{1}{3}x_2^3 + \frac{2}{3} + \frac{1}{10}x_1. \\
\operatorname{E}[Y \mid A=0, X=x] &= \int_z \operatorname{E}[Y \mid A=0, X=x, Z=z]\, f_{Z \mid X,A}(z \mid x, 0)\, dz \\
&= \int_z 2(1-z) \left(-\frac{1}{2}\max(x_1,x_2) + \max(x_2,z) + \frac{1}{10}x_1\right) dz \\
&= -\frac{1}{2}\max(x_1,x_2) + \frac{1}{3} + x_2^2 - \frac{1}{3}x_2^3 + \frac{1}{10}x_1.
\end{aligned}
$$

Overall, we have

$$
\operatorname{E}[Y \mid A=1, X=x] - \operatorname{E}[Y \mid A=0, X=x] = \tau(x_1,x_2) + \frac{1}{3} - x_2^2 + \frac{2}{3}x_2^3.
$$

We denote $\operatorname{E}[Y \mid A=1, X=x] - \operatorname{E}[Y \mid A=0, X=x]$ as $D$, and we have

$$
\begin{aligned}
\tau_s(X) &= \max(X_1, X_2), \\
H &= X_1 + X_2, \\
D &= \tau_s(X) + \text{bias}(X),
\end{aligned}
$$

where $\text{bias}(x_1,x_2) = \frac{1}{3} - x_2^2 + \frac{2}{3}x_2^3$ is the confounding bias.
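The step of integrating $Z$ out can be checked by numerical quadrature. The sketch below is illustrative only: it takes the outcome means $\mu_a(x,z)$ and the conditional densities $2z$ and $2(1-z)$ directly from the integrands in this derivation, assumes $Z$ ranges over $[0,1]$, and uses hypothetical helper names (`mu`, `f_z`, `cond_mean`, `observed_contrast`).

```python
# Assumptions: Z takes values in [0, 1]; mu_a and f_{Z|X,A} are read off the
# integrands in the derivation. Helper names are illustrative.

def mu(a, x1, x2, z):
    """Outcome mean mu_a(x, z) = E[Y | A=a, X=x, Z=z]."""
    sign = 0.5 if a == 1 else -0.5
    return sign * max(x1, x2) + max(x2, z) + 0.1 * x1


def f_z(a, z):
    """Conditional density of Z given A=a: 2z if treated, 2(1-z) otherwise."""
    return 2 * z if a == 1 else 2 * (1 - z)


def cond_mean(a, x1, x2, n=20000):
    """Midpoint-rule approximation of E[Y | A=a, X=x], integrating Z out."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * step
        total += mu(a, x1, x2, z) * f_z(a, z)
    return total * step


def bias(x2):
    """Closed-form confounding bias 1/3 - x2^2 + (2/3) x2^3."""
    return 1 / 3 - x2 ** 2 + (2 / 3) * x2 ** 3


def observed_contrast(x1, x2):
    """E[Y | A=1, X=x] - E[Y | A=0, X=x], without adjusting for Z."""
    return cond_mean(1, x1, x2) - cond_mean(0, x1, x2)


for x1, x2 in [(0.2, 0.7), (0.9, 0.1), (0.5, 0.5)]:
    numeric = observed_contrast(x1, x2)
    closed_form = max(x1, x2) + bias(x2)
    print(f"x=({x1}, {x2}): numeric={numeric:.6f}, closed form={closed_form:.6f}")
```

At each test point the unadjusted contrast should agree with $\max(x_1,x_2) + \text{bias}(x_2)$ up to quadrature error, which illustrates that the bias depends on $x_2$ only.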

We compute

$$
\begin{aligned}
\operatorname{E}[D] &= \operatorname{E}[\tau_s(X)] + \operatorname{E}[\text{bias}(X)] \\
&= \frac{2}{3} + \int_{x_2} \left(\frac{1}{3} - x_2^2 + \frac{2}{3}x_2^3\right) f_{X_2}(x_2)\, dx_2 \\
&= \frac{2}{3} + \frac{1}{6} = \frac{5}{6}.
\end{aligned}
$$
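For readers who want the intermediate step, the integral of the bias term can be expanded as follows. This is a sketch that assumes $X_2 \sim \text{Uniform}(0,1)$, so that $f_{X_2}(x_2) = 1$ on $[0,1]$, consistent with the unit joint density of $(X_2, X_1)$ used in this example.

```latex
\begin{aligned}
\operatorname{E}[\text{bias}(X)]
  &= \int_0^1 \left(\frac{1}{3} - x_2^2 + \frac{2}{3}x_2^3\right) dx_2 \\
  &= \left[\frac{x_2}{3} - \frac{x_2^3}{3} + \frac{x_2^4}{6}\right]_0^1
   = \frac{1}{3} - \frac{1}{3} + \frac{1}{6} = \frac{1}{6}.
\end{aligned}
```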

Then, we calculate $\operatorname{E}[D \mid H]$, which is an improper version of $\operatorname{E}[B \mid H]$ that is contaminated by confounding bias:

$$
\operatorname{E}[D \mid H] = \operatorname{E}[\tau_s(X) \mid H] + \operatorname{E}[\text{bias}(X) \mid H].
$$

To calculate $\operatorname{E}[\text{bias}(X) \mid H]$, we need the joint distribution of $(X_2, H)$ and then the conditional distribution of $X_2 \mid H$. Because the transformation $(x_2, x_1) \mapsto (x_2, h)$, with $h = x_1 + x_2$, is injective, we obtain

$$
f_{X_2,H}(x_2,h) = f_{X_2,X_1+X_2}(x_2, x_1+x_2) = f_{X_2,X_1}(x_2,x_1)\,|J| = 1,
$$

where $0 \leq x_2 \leq 1$, $0 \leq h \leq 2$, $h-1 \leq x_2$, and $x_2 \leq h$. The conditional distribution is therefore

$$
\begin{aligned}
f_{X_2 \mid H}(x_2 \mid h) &= \begin{cases} \frac{1}{h}, & 0 \leq x_2 \leq 1,\ 0 < h \leq 1,\ h-1 \leq x_2,\ x_2 \leq h, \\ \frac{1}{2-h}, & 0 \leq x_2 \leq 1,\ 1 \leq h < 2,\ h-1 \leq x_2,\ x_2 \leq h, \\ 0, & \text{otherwise}. \end{cases} \\
\operatorname{E}[\text{bias}(X_2) \mid H = h] &= \begin{cases} \frac{1}{h}\int_0^h \text{bias}(x_2)\, dx_2 = \frac{1}{3} - \frac{h^2}{3} + \frac{h^3}{6}, & 0 < h \leq 1, \\ \frac{1}{2-h}\int_{h-1}^1 \text{bias}(x_2)\, dx_2 = \frac{(2-h)^2}{3} - \frac{(2-h)^3}{6}, & 1 < h < 2, \\ 0, & \text{otherwise}. \end{cases} \\
\tilde{\operatorname{E}}[B \mid H = h] &= \operatorname{E}[\tau_s(X) \mid H = h] + \operatorname{E}[\text{bias}(X_2) \mid H = h] \\
&= \begin{cases} \frac{3h}{4} + \frac{1}{3} - \frac{h^2}{3} + \frac{h^3}{6}, & 0 < h \leq 1, \\ \frac{1 - h^2/4}{2-h} + \frac{(2-h)^2}{3} - \frac{(2-h)^3}{6}, & 1 < h < 2, \\ 0, & \text{otherwise}. \end{cases}
\end{aligned}
$$

Finally, we calculate the $C_b$ for $H$ by computing

$$
\begin{aligned}
\operatorname{E}[\text{bias}(X_2) F_H(H)] &= \int_{x_2} \int_h \left(\frac{1}{3} - x_2^2 + \frac{2}{3}x_2^3\right) F_H(h)\, f_{X_2,H}(x_2,h)\, dx_2\, dh \\
&= \frac{13}{504} + \frac{43}{1260},
\end{aligned}
$$

and

$$
\begin{aligned}
\operatorname{E}[D F_H(H)] &= \operatorname{E}[\tau_s(X) F_H(H)] + \operatorname{E}[\text{bias}(X_2) F_H(H)] \\
&= 2\left(\frac{3}{80} + \frac{19}{120}\right) + \left(\frac{13}{504} + \frac{43}{1260}\right).
\end{aligned}
$$

Therefore,

$$
\tilde{C}_{b,h} = 1 - \frac{\operatorname{E}[D]}{2\operatorname{E}[D F_H(H)]} = 1 - \frac{5/6}{2\left(2\left(\frac{3}{80} + \frac{19}{120}\right) + \left(\frac{13}{504} + \frac{43}{1260}\right)\right)} = 0.07732865.
$$
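The arithmetic behind this value, and the gap between the confounded and unconfounded $C_b$ for $H$, can be reproduced with exact rational arithmetic. The sketch below is illustrative: the fractions are copied from the derivation, the variable names are hypothetical, and the grid check of $\operatorname{E}[\text{bias}(X_2) F_H(H)]$ assumes $X_1$ and $X_2$ are independent Uniform(0, 1).

```python
from fractions import Fraction as Fr

# Quantities taken from the derivation (exact rational arithmetic).
e_tau_FH = 2 * (Fr(3, 80) + Fr(19, 120))   # E[tau_s(X) F_H(H)] = 47/120
e_bias_FH = Fr(13, 504) + Fr(43, 1260)     # E[bias(X2) F_H(H)] = 151/2520
e_D = Fr(2, 3) + Fr(1, 6)                  # E[D] = 5/6
e_D_FH = e_tau_FH + e_bias_FH              # E[D F_H(H)] = 569/1260

cb_true = 1 - Fr(2, 3) / (2 * e_tau_FH)    # C_b for H with confounding controlled: 7/47
cb_naive = 1 - e_D / (2 * e_D_FH)          # C_b for H without control: 44/569
evaluation_bias = cb_naive - cb_true       # propagated bias in the metric


def F_H(h):
    """CDF of H = X1 + X2 under the assumed independent Uniform(0, 1) covariates."""
    return h * h / 2 if h <= 1 else 1 - (2 - h) ** 2 / 2


def bias(x2):
    """Confounding bias 1/3 - x2^2 + (2/3) x2^3."""
    return 1 / 3 - x2 ** 2 + (2 / 3) * x2 ** 3


def grid_mean(g, n=300):
    """Midpoint-rule approximation of E[g(X1, X2)] on the unit square."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * step
        for j in range(n):
            x2 = (j + 0.5) * step
            total += g(x1, x2)
    return total * step * step


# Independent numerical check of the bias term that enters the denominator.
grid_check = grid_mean(lambda x1, x2: bias(x2) * F_H(x1 + x2))

print("C_b (controlled):", cb_true, "=", float(cb_true))
print("C_b (uncontrolled):", cb_naive, "=", float(cb_naive))
print("evaluation bias:", float(evaluation_bias))
print("grid check of E[bias F_H]:", grid_check, "vs", float(e_bias_FH))
```

Keeping the values as `Fraction` objects avoids rounding until the final conversion to floating point, which makes the comparison with the reported decimals straightforward.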