Parameter identification in linear non-Gaussian causal models under general confounding

Daniele Tramontano, Technical University of Munich; School of Computation, Information and Technology, Germany
Mathias Drton, Technical University of Munich; School of Computation, Information and Technology, Germany
Jalal Etesami, Technical University of Munich; School of Computation, Information and Technology, Germany
Abstract

Linear non-Gaussian causal models postulate that each random variable is a linear function of parent variables and non-Gaussian exogenous error terms. We study identification of the linear coefficients when such models contain latent variables. Our focus is on the commonly studied acyclic setting, where each model corresponds to a directed acyclic graph (DAG). For this case, prior literature has demonstrated that connections to overcomplete independent component analysis yield effective criteria to decide parameter identifiability in latent variable models. However, this connection is based on the assumption that the observed variables linearly depend on the latent variables. Departing from this assumption, we treat models that allow for arbitrary non-linear latent confounding. Our main result is a graphical criterion that is necessary and sufficient for deciding the generic identifiability of direct causal effects. Moreover, we provide an algorithmic implementation of the criterion with a run time that is polynomial in the number of observed variables. Finally, we report on estimation heuristics based on the identification result, explore a generalization to models with feedback loops, and provide new results on the identifiability of the causal graph.

Keywords: Causal effect, graphical model, independent component analysis, latent variable model, structural causal model

1 Introduction

Graphical models, directed or undirected, are ubiquitous in modern statistical practice for modeling multivariate distributions; see, e.g., Lauritzen (1996); Maathuis et al. (2019). In particular, structural equation models (SEMs) associated with directed acyclic graphs (DAGs) provide a concise and effective way of stating the additional assumptions necessary to identify the causal parameters of interest (Pearl, 2009; Spirtes et al., 2000). As argued in Pearl (2017), understanding a causal phenomenon for linear SEMs is often a necessary step towards a generalized understanding of the same in a nonparametric framework.

The specific framework we focus on in this paper is that of linear non-Gaussian models that allow for general, possibly non-linear, confounding. Each such model is naturally associated to an acyclic directed mixed graph (ADMG); compare Richardson et al. (2023) and references therein. Let $X=(X_v)_{v\in V}$ be a collection of observed random variables, and let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$ be an ADMG whose vertices index the random variables $X_v$ and which contains two types of edge sets $E_{\rightarrow},E_{\leftrightarrow}\subseteq V\times V\setminus\{(v,v):v\in V\}$. The edges in $E_{\rightarrow}$ are directed, and we depict them by $u\to v$. Those in $E_{\leftrightarrow}$ are bidirected and depicted by $u\leftrightarrow v$. The model encoded by $\mathcal{G}$ posits that

\[
X_v=\sum_{w\,:\,w\to v\in E_{\rightarrow}}\lambda_{wv}X_w+\varepsilon_v,\quad v\in V, \tag{1.1}
\]

where possible latent confounding is subsumed in the error variables $\varepsilon_v$, which are allowed to be dependent in accordance with the bidirected edges in $E_{\leftrightarrow}$. In particular, if $v\leftrightarrow w\in E_{\leftrightarrow}$, then the two errors $\varepsilon_v$ and $\varepsilon_w$ may be (arbitrarily) dependent.

Example 1.1.

To illustrate the above definition, consider the graph from Fig. 1. It specifies an instrumental variable model for the joint distribution of three observed variables. The equations from (1.1) take the form:

\[
X_1=\varepsilon_1,\qquad X_2=\lambda_{12}X_1+\varepsilon_2,\qquad X_3=\lambda_{23}X_2+\varepsilon_3, \tag{1.2}
\]

where $\varepsilon_1$ is independent of $(\varepsilon_2,\varepsilon_3)$ but $\varepsilon_2$ and $\varepsilon_3$ may be dependent.

[Figure: $X_1$ (Tax Rate) $\to$ $X_2$ (Mom's Smoking) $\to$ $X_3$ (Baby's Weight), with edge weights $\lambda_{12}$ and $\lambda_{23}$; "Confounders" act on both $X_2$ and $X_3$.]
Figure 1: Instrumental variable graph based on Evans and Ringel (1999).

In this paper, we treat the problem of deciding which of the direct causal effects $\lambda_{wv}$ in (1.1) can be identified from the joint distribution of the vector of observed variables $X$. Our main results give a complete graphical characterization in terms of the ADMG $\mathcal{G}$, an efficient algorithm to check the resulting identifiability criterion, and a simple practical estimation method that is based on empirical measures of dependence among estimates of the errors $(\varepsilon_v)_{v\in V}$. We should highlight that our characterization targets generic identifiability, which is the notion most suitable for problems such as the instrumental variable model from Example 1.1. There, the key coefficient of interest $\lambda_{23}$ is identified as a ratio of covariances, $\mathrm{Cov}[X_1,X_3]/\mathrm{Cov}[X_1,X_2]$, but only if the denominator is nonzero, which requires the genericity constraint that $\lambda_{12}\neq 0$. As part of our results, we also develop a framework for formulating genericity conditions on the infinite-dimensional set of non-Gaussian error distributions, which we justify via cumulants truncated at arbitrary order.
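The covariance-ratio identification of $\lambda_{23}$ can be illustrated with a small simulation. The sketch below is not the estimation method of this paper; it only checks the classical ratio formula under one hypothetical choice of coefficients, a hypothetical latent variable `H`, and hypothetical non-Gaussian error distributions in which `H` enters $\varepsilon_2$ and $\varepsilon_3$ non-linearly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
lam12, lam23 = 0.8, -0.5  # hypothetical true coefficients, with lam12 != 0

# Hypothetical latent confounder H entering eps2 and eps3 non-linearly.
# eps1 is drawn independently of (eps2, eps3), as required by the model.
H = rng.uniform(-2.0, 2.0, size=n)
eps1 = rng.uniform(-1.0, 1.0, size=n)
eps2 = H + 0.5 * H**2 + rng.uniform(-1.0, 1.0, size=n)
eps3 = H**2 + rng.laplace(scale=0.5, size=n)

# Structural equations (1.2).
X1 = eps1
X2 = lam12 * X1 + eps2
X3 = lam23 * X2 + eps3


def cov(a, b):
    """Empirical covariance of two one-dimensional samples."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# Ratio of covariances; well defined because Cov[X1, X2] = lam12 * Var[X1] != 0.
iv_estimate = cov(X1, X3) / cov(X1, X2)
# Naive regression of X3 on X2, which is biased by the dependence of eps2 and eps3.
ols_estimate = cov(X2, X3) / cov(X2, X2)

print(f"true lam23 = {lam23}, ratio estimate = {iv_estimate:.3f}, naive OLS = {ols_estimate:.3f}")
```

In this setup the ratio estimate should land close to the true value, while the naive regression coefficient is shifted by $\mathrm{Cov}[\varepsilon_2,\varepsilon_3]/\mathrm{Var}[X_2]$.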

1.1 Related Work

For fully nonparametric SEMs, the ID algorithm (Shpitser and Pearl, 2006; Kivva et al., 2022; Shpitser, 2023; Kivva et al., 2023a) is sound and complete for determining global identifiability in a given ADMG. In contrast to the generic setting treated in this paper, global identifiability requires identifiability under every single distribution in the model. It follows from the work of Drton et al. (2011) that the graphical criterion underpinning the ID algorithm also applies to global identifiability within linear Gaussian models. However, the graphical prerequisites for achieving global identification are frequently overly restrictive. For example, any ADMG containing a bow (a pair of nodes $u,v\in V$ such that both $u\to v$ and $u\leftrightarrow v$ are in $\mathcal{G}$) would fail to meet the criteria for global identifiability, thus overlooking significant scenarios such as the IV model illustrated in Fig. 1. Consequently, when studying linear SEMs, researchers have shifted their focus towards generic identifiability results, for which much progress has been made recently, but for which a complete characterization is still lacking; see, e.g., Kumor et al. (2020); Barber et al. (2022).
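As a small illustration of the bow definition, the following sketch lists the bows of an ADMG given as two edge lists. The function name `find_bows` and the edge-list encoding are illustrative assumptions, not notation from the paper; the example graph is the IV graph of Fig. 1 with nodes labeled 1, 2, 3.

```python
def find_bows(directed, bidirected):
    """Return the directed edges (u, v) for which u <-> v is also an edge."""
    # Bidirected edges are unordered, so store them as frozensets.
    sibling_pairs = {frozenset(edge) for edge in bidirected}
    return sorted((u, v) for (u, v) in directed if frozenset((u, v)) in sibling_pairs)


# IV graph from Fig. 1: 1 -> 2 -> 3 with a bidirected edge 2 <-> 3.
iv_directed = [(1, 2), (2, 3)]
iv_bidirected = [(2, 3)]

bows = find_bows(iv_directed, iv_bidirected)
print(bows)
```

The IV graph contains the bow formed by $X_2\to X_3$ and $X_2\leftrightarrow X_3$, which is why it fails the global criterion although $\lambda_{23}$ is generically identifiable.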

Non-Gaussianity of the error term has been extensively employed to achieve identifiability of the graphical structure of causal models; see Shimizu (2022) for a recent account. In contrast, its application to causal effect identification has received little attention (Salehkaleybar et al., 2020; Kivva et al., 2023b; Shuai et al., 2023). All the works just mentioned explicitly model the confounding as linear. This approach, on the one hand, allows one to draw on the vast literature on overcomplete independent component analysis (OICA) in order to obtain stronger identifiability results (Eriksson and Koivunen, 2004). On the other hand, however, it restricts the possible confounding structures. Moreover, as opposed to ICA in the fully observed case, overcomplete ICA is not separable (Eriksson and Koivunen, 2004, Thm. 4), implying that the only algorithms for solving OICA that come with theoretical guarantees require making parametric distributional assumptions and using ad-hoc EM-type algorithms to solve the optimization problem; see, e.g., Lewicki and Sejnowski (2000).

The work most similar to ours in terms of distributional assumptions is that of Wang and Drton (2023), which focuses on structural identifiability. Wang and Drton (2023) note that bow-free acyclic graphs are identifiable from observational data and provide an estimation algorithm for such graphs. Liu et al. (2021) extended the algorithm to learn graphs with multi-directed edges.

1.2 Organization of the Paper

The rest of the paper is organized as follows. Section 1.3 contains standard graphical model notation used in the rest of the paper. In Section 2, we formally define the identifiability problem we study. Section 3 contains the main results of our work; we provide a necessary and sufficient graphical condition for generic identifiability in the model under study. In Section 4, we prove that our criterion can be certified in polynomial time in the size of the graph. Section 5 contains a detailed analysis of the genericity assumption. In Section 6, we apply our results from Section 3 to the identifiability of the causal graph, providing new insights on the model equivalence of two ADMGs. In Section 7, we provide partial results about the identification for cyclic models. In Section 8, we note that when the identification criterion is met, the parameters can be estimated as the solution to a suitable optimization problem, and we present a simulation study to assess the performance of the estimation method. In Section 9, we draw final conclusions and suggest future research directions. Appendix A contains further preliminary material and details of the proofs.

1.3 Notation

[Figure: nodes $v_1,v_2,v_3,v_4$ with directed edges $v_1\to v_2$, $v_1\to v_3$, $v_2\to v_4$, $v_3\to v_4$ carrying the weights $\lambda_{12},\lambda_{13},\lambda_{24},\lambda_{34}$.]
Figure 2: An acyclic directed mixed graph (ADMG) with 4 nodes.

A mixed graph is a triple $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$, where $E_{\rightarrow},E_{\leftrightarrow}\subseteq V\times V\setminus\{(v,v):v\in V\}$. We depict the pairs in $E_{\rightarrow}$ by $u\to v$ and the ones in $E_{\leftrightarrow}$ by $u\leftrightarrow v$; we refer to them as directed and bidirected edges, respectively. For an example, see Fig. 2.

Let $u,v\in V$ be two vertices in $\mathcal{G}$. A directed path from $u$ to $v$ is a sequence of nodes $\pi=(u=v_0,\dots,v_k=v)$ such that $v_i\to v_{i+1}\in E_{\rightarrow}$ for all $i=0,\dots,k-1$. This includes the case $k=0$, where $u=v$ and the path has no edges; we call such a path trivial. We denote by $\mathcal{P}(u,v)$ the set of all directed paths from $u$ to $v$.

A directed cycle is a non-trivial directed path from a node $u$ to itself. The graph $\mathcal{G}$ is acyclic if it contains no directed cycles; we refer to this class of graphs as acyclic directed mixed graphs (ADMGs). Fig. 2 shows one instance. If the graph is acyclic, we can define a causal order on the nodes of $\mathcal{G}$, that is, a total order $\leq$ on $V$ such that $u\leq v$ whenever $\mathcal{P}(u,v)\neq\emptyset$. When considering parameter matrices associated to $\mathcal{G}$, we will typically fix a causal order $\leq$ and assume that the vertices in $V$ are enumerated as $v_1,\dots,v_p$ with $i\leq j$ whenever $v_i\leq v_j$; compare Fig. 2.
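The notions of directed path, acyclicity, and causal order can be made concrete in a few lines. The following is a minimal sketch on the directed part of the graph in Fig. 2, with nodes encoded as the integers 1 to 4; the helper names `children_map`, `directed_paths`, and `causal_order` are illustrative, and the causal order is obtained with the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

nodes = [1, 2, 3, 4]
directed = [(1, 2), (1, 3), (2, 4), (3, 4)]  # directed edges of Fig. 2


def children_map(nodes, directed):
    """Map each node to the list of its children."""
    ch = {v: [] for v in nodes}
    for u, v in directed:
        ch[u].append(v)
    return ch


def directed_paths(u, v, ch):
    """All directed paths from u to v in an acyclic graph, as tuples of nodes.

    For u == v the only path is the trivial one (u,), since there are no cycles.
    """
    if u == v:
        return [(u,)]
    paths = []
    for w in ch[u]:
        paths.extend((u,) + rest for rest in directed_paths(w, v, ch))
    return paths


def causal_order(nodes, directed):
    """Return one causal (topological) order, or None if a directed cycle exists."""
    predecessors = {v: set() for v in nodes}
    for u, v in directed:
        predecessors[v].add(u)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


ch = children_map(nodes, directed)
print(directed_paths(1, 4, ch))  # the two paths through nodes 2 and 3
print(causal_order(nodes, directed))
```

A causal order is generally not unique; here nodes 2 and 3 may appear in either order.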

We will consider the following genealogical relations that are commonly used to indicate relationships between the vertices of an ADMG (parents, ancestors, children, descendants, siblings):

\[
\begin{aligned}
\mathrm{pa}(v) &:= \{u\in V : u\to v\in\mathcal{G}\}, &\qquad \mathrm{an}(v) &:= \{u\in V : \mathcal{P}(u,v)\neq\emptyset\},\\
\mathrm{ch}(v) &:= \{u\in V : v\to u\in\mathcal{G}\}, &\qquad \mathrm{de}(v) &:= \{u\in V : \mathcal{P}(v,u)\neq\emptyset\},\\
\mathrm{sib}(v) &:= \{u\in V : u\leftrightarrow v\in\mathcal{G}\}, &\qquad \mathrm{Sib}(v) &:= \mathrm{sib}(v)\cup\{v\}.
\end{aligned}
\]

Note that $v\in\mathrm{an}(v)$ and $v\in\mathrm{de}(v)$, via trivial paths. For a subset of vertices $U\subseteq V$, we define $\mathrm{pa}(U)=\cup_{u\in U}\mathrm{pa}(u)$ and make the analogous convention for the other relations.
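The genealogical relations translate directly into set computations. The sketch below evaluates them on the graph of Fig. 2. The directed edges are those shown in the figure; the bidirected edges $1\leftrightarrow 2$, $2\leftrightarrow 4$, $3\leftrightarrow 4$ are an assumption, chosen as one edge set consistent with the independences listed for this graph in Example 2.1. The helper `lift` is an illustrative name for the extension of a relation to vertex subsets.

```python
directed = [(1, 2), (1, 3), (2, 4), (3, 4)]  # directed edges of Fig. 2
bidirected = [(1, 2), (2, 4), (3, 4)]        # ASSUMED bidirected edges (see lead-in)


def pa(v):
    return {u for (u, w) in directed if w == v}


def ch(v):
    return {w for (u, w) in directed if u == v}


def _closure(v, step):
    """Nodes reachable from v by repeatedly applying `step`, including v itself."""
    result, stack = {v}, [v]
    while stack:
        x = stack.pop()
        for y in step(x):
            if y not in result:
                result.add(y)
                stack.append(y)
    return result


def an(v):
    """Ancestors of v; contains v via the trivial path."""
    return _closure(v, pa)


def de(v):
    """Descendants of v; contains v via the trivial path."""
    return _closure(v, ch)


def sib(v):
    return {u for (u, w) in bidirected if w == v} | {w for (u, w) in bidirected if u == v}


def Sib(v):
    return sib(v) | {v}


def lift(relation, U):
    """Extend a relation to a subset U by taking the union over its elements."""
    return set().union(*(relation(u) for u in U))


print(pa(4), an(4), de(2), sib(2), Sib(3), lift(pa, {2, 3}))
```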

Let $U=\{u_1,\dots,u_n\}$ and $W=\{w_1,\dots,w_n\}$ be two subsets of $V$ that have the same cardinality $n$, and for which we have fixed an ordering of their elements. Let $S_n$ be the symmetric group on $[n]=\{1,\dots,n\}$. We say that $\Pi=(\pi_1,\dots,\pi_n)$ is a system of paths between $U$ and $W$ if there exists a permutation $\sigma_\Pi\in S_n$ such that $\pi_k\in\mathcal{P}(u_k,w_{\sigma_\Pi(k)})$ for every $k\in[n]$. We denote the set of all such systems by $\mathcal{P}(U,W)$. A system $\Pi\in\mathcal{P}(U,W)$ is called non-intersecting if $\pi_k\cap\pi_l=\emptyset$ for $k\neq l$. The set of all non-intersecting systems in $\mathcal{P}(U,W)$ is denoted by $\tilde{\mathcal{P}}(U,W)$; see Fig. 3 for an example.

[Figure: two panels. (a) Nodes $u_1,u_2,w_1,w_2$. (b) Nodes $u_1,u_2,w_1,w_2$ and an intermediate node $c$.]

Figure 3: (a) A non-intersecting system of paths from $\{u_1,u_2\}$ to $\{w_1,w_2\}$. (b) Two intersecting paths with node $c$ in their intersection.
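Systems of paths and the non-intersecting property can be enumerated by brute force on small graphs. The sketch below does so for two hypothetical graphs in the spirit of Fig. 3; the exact edges of the figure are not reproduced here, so graph (a) is assumed to consist of the edges $u_1\to w_1$ and $u_2\to w_2$, and graph (b) is assumed to route both paths through a common node $c$. All function names are illustrative.

```python
from itertools import permutations, product


def children_map(edges):
    """Map each node to the list of its children."""
    ch = {}
    for u, v in edges:
        ch.setdefault(u, []).append(v)
        ch.setdefault(v, [])
    return ch


def directed_paths(u, v, ch):
    """All directed paths from u to v in an acyclic graph, as tuples of nodes."""
    if u == v:
        return [(u,)]
    paths = []
    for w in ch[u]:
        paths.extend((u,) + rest for rest in directed_paths(w, v, ch))
    return paths


def path_systems(U, W, ch):
    """All systems of paths between the ordered lists U and W of equal length."""
    systems = []
    for sigma in permutations(range(len(W))):
        choices = [directed_paths(U[k], W[sigma[k]], ch) for k in range(len(U))]
        systems.extend(product(*choices))  # empty if some pair has no path
    return systems


def is_non_intersecting(system):
    """True if no two paths of the system share a node."""
    seen = set()
    for path in system:
        if seen & set(path):
            return False
        seen |= set(path)
    return True


U, W = ["u1", "u2"], ["w1", "w2"]
graph_a = children_map([("u1", "w1"), ("u2", "w2")])  # ASSUMED edges
graph_b = children_map([("u1", "c"), ("u2", "c"), ("c", "w1"), ("c", "w2")])  # ASSUMED edges

systems_a = path_systems(U, W, graph_a)
systems_b = path_systems(U, W, graph_b)
print(len(systems_a), sum(map(is_non_intersecting, systems_a)))
print(len(systems_b), sum(map(is_non_intersecting, systems_b)))
```

In the assumed graph (b), every system must pass through $c$ twice, so $\tilde{\mathcal{P}}(U,W)$ is empty even though $\mathcal{P}(U,W)$ is not.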

When connecting a graph $\mathcal{G}$ to a statistical model, we will introduce a matrix of parameters whose entries act as weights on the directed edges. We will write $\mathbb{R}^{\mathcal{G}_D}$ for the set of $p\times p$ real matrices $\Lambda=(\lambda_{uv})$ such that $\lambda_{uv}=0$ if $u\to v\notin E_{\rightarrow}$. When $\mathcal{G}$ is acyclic, as we assume throughout this work, the matrix $I-\Lambda$ is invertible for all $\Lambda\in\mathbb{R}^{\mathcal{G}_D}$; here, $I$ denotes the identity matrix. Indeed, when the nodes of $\mathcal{G}$ are ordered according to a causal order, $(I-\Lambda)^{T}$ is lower triangular with all ones on the diagonal and $\det(I-\Lambda)=1$. We define $B_\Lambda:=(I-\Lambda)^{-T}$. Later in Section 7, we briefly discuss the identifiability problem in cyclic graphs, where the invertibility of $I-\Lambda$ becomes a modeling assumption.
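The invertibility claim can be spelled out through a standard identity that is not stated explicitly above. Under a causal order, $\Lambda$ is strictly upper triangular and hence nilpotent, so the Neumann series for $(I-\Lambda)^{-1}$ terminates. Since the $(u,v)$ entry of $\Lambda^{k}$ sums the edge-weight products over directed walks of length $k$ from $u$ to $v$, and walks in an acyclic graph are paths, the entries of $B_\Lambda$ are sums over directed paths:

```latex
\[
(I-\Lambda)^{-1}=\sum_{k=0}^{p-1}\Lambda^{k},
\qquad
[B_\Lambda]_{vu}=\bigl[(I-\Lambda)^{-1}\bigr]_{uv}
=\sum_{\pi\in\mathcal{P}(u,v)}\ \prod_{x\to y\,\in\,\pi}\lambda_{xy},
\]
```

where the trivial path contributes the empty product $1$, which accounts for the ones on the diagonal.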

Finally, let $U$ and $W$ be subsets of the row and column sets of a matrix $A$, respectively. We denote the submatrix containing only the rows in $U$ and the columns in $W$ as $A_{U,W}$.

2 Linear Mixed Graph Models and Identifiability

Let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$ be an ADMG, and let $\mathcal{G}_D=(V,E_{\rightarrow})$ and $\mathcal{G}_B=(V,E_{\leftrightarrow})$ be its directed and bidirected subgraphs, respectively. We say that a subset $C\subseteq V$ is connected in the bidirected part $\mathcal{G}_B$ if every pair of vertices $u,v\in C$ is joined by a path in $\mathcal{G}_B$, where every vertex on the path is in $C$. On a fixed probability space, let $\varepsilon=(\varepsilon_1,\dots,\varepsilon_p)$ be a random vector taking values in $\mathbb{R}^{p}$ and satisfying the connected set Markov property with respect to $\mathcal{G}_B$ (Richardson, 2003; Drton and Richardson, 2008), that is,

\[
\varepsilon_C \perp\!\!\!\perp \varepsilon_{V\setminus\mathrm{Sib}(C)} \quad\text{for all } \emptyset\neq C\subset V,\ C \text{ connected in } \mathcal{G}_B.
\]

We denote the set of all such random vectors by $\mathcal{M}(\mathcal{G}_B)$. Note that the connected set Markov property implies, but is generally stronger than, the requirement that $\varepsilon_u \perp\!\!\!\perp \varepsilon_v$ for $u,v$ non-adjacent in $\mathcal{G}_B$.
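To see which independence statements the connected set Markov property generates, one can enumerate the connected sets of a small bidirected graph. The sketch below does this for four nodes with the bidirected edges $1\leftrightarrow 2$, $2\leftrightarrow 4$, $3\leftrightarrow 4$; this edge set is an assumption, chosen so that the output matches the independences listed for Fig. 2 in the example following Definition 2.1. Function names are illustrative.

```python
from itertools import combinations

nodes = [1, 2, 3, 4]
bidirected = [(1, 2), (2, 4), (3, 4)]  # ASSUMED bidirected edges (see lead-in)


def sib(v):
    return {u for (u, w) in bidirected if w == v} | {w for (u, w) in bidirected if u == v}


def Sib(C):
    """Sib(C): the union of sib(v) over v in C, together with C itself."""
    return set(C).union(*(sib(v) for v in C))


def is_connected(C):
    """True if every pair in C is joined by a bidirected path staying inside C."""
    C = set(C)
    start = next(iter(C))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in sib(x) & C:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == C


def markov_statements(nodes):
    """Pairs (C, R) meaning eps_C is independent of eps_R, with R = V \\ Sib(C) non-empty."""
    statements = []
    for size in range(1, len(nodes)):
        for C in combinations(nodes, size):
            if is_connected(C):
                rest = tuple(v for v in nodes if v not in Sib(C))
                if rest:
                    statements.append((C, rest))
    return statements


statements = markov_statements(nodes)
for C, rest in statements:
    print(f"eps_{C} independent of eps_{rest}")
```

All statements produced for this edge set follow from the two joint statements $(\varepsilon_1,\varepsilon_2)\perp\!\!\!\perp\varepsilon_3$ and $\varepsilon_1\perp\!\!\!\perp(\varepsilon_3,\varepsilon_4)$.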

Definition 2.1.

The linear structural equation model $\mathcal{M}(\mathcal{G})$ corresponding to a mixed graph $\mathcal{G}$ is the set of all $p$-variate real random vectors $X$ (on our fixed probability space) that solve the equation system

\[
X=\Lambda^{T}X+\varepsilon \iff X=(I-\Lambda)^{-T}\varepsilon=B_\Lambda\,\varepsilon,
\]

for a choice of $\Lambda\in\mathbb{R}^{\mathcal{G}_D}$ and $\varepsilon\in\mathcal{M}(\mathcal{G}_B)$. The model $\mathcal{M}(\mathcal{G})$ is thus parametrized by the map

\[
\begin{aligned}
\Phi_{\mathcal{G}}:\mathbb{R}^{\mathcal{G}_D}\times\mathcal{M}(\mathcal{G}_B)&\to\mathcal{M}(\mathcal{G})\\
(\Lambda,\varepsilon)&\mapsto(I-\Lambda)^{-T}\varepsilon.
\end{aligned}
\]
Example 2.1.

Let $\mathcal{G}$ be the ADMG from Fig. 2. The set $\mathcal{M}(\mathcal{G}_B)$ contains all the random vectors $\varepsilon=(\varepsilon_1,\varepsilon_2,\varepsilon_3,\varepsilon_4)$ such that $(\varepsilon_1,\varepsilon_2)\perp\!\!\!\perp\varepsilon_3$ and $\varepsilon_1\perp\!\!\!\perp(\varepsilon_3,\varepsilon_4)$. The space $\mathbb{R}^{\mathcal{G}_D}$ consists of all matrices of the form

\[
\Lambda=\begin{bmatrix}0&\lambda_{12}&\lambda_{13}&0\\ 0&0&0&\lambda_{24}\\ 0&0&0&\lambda_{34}\\ 0&0&0&0\end{bmatrix}.
\]

Accordingly, we have

\[
B_\Lambda:=(I-\Lambda)^{-T}=\begin{bmatrix}1&0&0&0\\ \lambda_{12}&1&0&0\\ \lambda_{13}&0&1&0\\ \lambda_{12}\lambda_{24}+\lambda_{13}\lambda_{34}&\lambda_{24}&\lambda_{34}&1\end{bmatrix}.
\]
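The closed form of $B_\Lambda$ in this example can be checked numerically. The sketch below uses arbitrary, hypothetical numeric values for the four edge weights and compares the matrix inverse with the displayed formula. It also confirms that $X=B_\Lambda\varepsilon$ solves the structural equations and that $\varepsilon$ is recovered as $(I-\Lambda)^{T}X$. The errors are drawn as independent Laplace variables purely for convenience; full independence is a special case of the connected set Markov property.

```python
import numpy as np

# Hypothetical numeric values for the edge weights of Fig. 2.
lam12, lam13, lam24, lam34 = 0.5, -1.2, 2.0, 0.7

# Lambda has entry (u, v) equal to the weight of the edge u -> v (0-based indices).
Lam = np.zeros((4, 4))
Lam[0, 1], Lam[0, 2], Lam[1, 3], Lam[2, 3] = lam12, lam13, lam24, lam34

I = np.eye(4)
B = np.linalg.inv(I - Lam).T  # B_Lambda = (I - Lambda)^{-T}

# Closed form from the example above.
B_closed = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [lam12, 1.0, 0.0, 0.0],
    [lam13, 0.0, 1.0, 0.0],
    [lam12 * lam24 + lam13 * lam34, lam24, lam34, 1.0],
])

# Draw a few error vectors (one per column) and map them through the model.
rng = np.random.default_rng(1)
eps = rng.laplace(size=(4, 5))
X = B @ eps
eps_recovered = (I - Lam).T @ X

print(np.allclose(B, B_closed), np.linalg.det(I - Lam))
```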

In this paper, we are concerned with parameter identifiability. In other words, we ask under which conditions on $\mathcal{G}$ the distribution of $X\in\mathcal{M}(\mathcal{G})$ uniquely determines entries of the coefficient matrix $\Lambda\in\mathbb{R}^{\mathcal{G}_{D}}$ in the representation $X=(I-\Lambda)^{-T}\cdot\varepsilon$. While we will not emphasize this in the sequel, the unique determination of all entries of $\Lambda$ also entails unique recovery of the distribution of $\varepsilon=(I-\Lambda)^{T}X$.

As noted earlier, our interest is in a generic notion of identifiability, so we ask:

Problem.

Under which graphical conditions on $\mathcal{G}$ is a set of entries $\lambda_{uv}$ of the parameter matrix $\Lambda$ generically identifiable?

To detail the problem, we make the following definition and then firm up the involved notion of genericity.

Definition 2.2.

We define the fiber of an element $X\in\mathcal{M}(\mathcal{G})$ with respect to $\Phi_{\mathcal{G}}$ as the set

\[
\Phi_{\mathcal{G}}^{-1}(X):=\{(\Lambda,\varepsilon)\in\mathbb{R}^{\mathcal{G}_{D}}\times\mathcal{M}(\mathcal{G}_{B})\;:\;\Phi_{\mathcal{G}}(\Lambda,\varepsilon)\stackrel{d}{=}X\}, \tag{2.1}
\]

where $\stackrel{d}{=}$ denotes equality in distribution. Let $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi_{\mathcal{G}}^{-1}(X))$ be the projection of this set onto $\mathbb{R}^{\mathcal{G}_{D}}$. A parameter given by a function $f(\Lambda)$ is generically identifiable if any generic choice of $(\Lambda,\varepsilon)$ yields a random vector $X=(I-\Lambda)^{-T}\varepsilon$ for which it holds that

\[
f(\tilde{\Lambda})=f(\Lambda)\ \text{for all}\ \tilde{\Lambda}\in\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi^{-1}_{\mathcal{G}}(X)).
\]

Requiring genericity of $\Lambda$ will mean that we exclude a fixed Lebesgue null set of $\mathbb{R}^{\mathcal{G}_{D}}$. For instance, in the instrumental variable (IV) example depicted in Fig. 1, the unknown coefficients are $(\lambda_{12},\lambda_{23})$ and $\mathbb{R}^{\mathcal{G}_{D}}\equiv\mathbb{R}^{2}$. Coefficient $\lambda_{23}$ is identifiable outside the null set given by $\lambda_{12}=0$, i.e., we exclude the case that the instrument ($X_{1}$) does not affect the exposure ($X_{2}$).
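To make the role of the excluded null set concrete, the following simulation sketch uses the classical covariance-ratio IV estimator $\operatorname{cov}(X_1,X_3)/\operatorname{cov}(X_1,X_2)$. This is a textbook estimator used here only for illustration; it is not the estimation procedure of this paper. The graph $1\to 2\to 3$ with $2\leftrightarrow 3$, the error distributions, the non-linear confounding mechanism, and the function names are all assumptions made for the example.

```python
import random


def simulate_iv(lam12, lam23, n=100_000, seed=0):
    """Draw n samples from an assumed IV model 1 -> 2 -> 3 with 2 <-> 3."""
    rng = random.Random(seed)
    xs1, xs2, xs3 = [], [], []
    for _ in range(n):
        e1 = rng.uniform(-1.0, 1.0)  # instrument error, independent of the rest
        h = rng.uniform(-1.0, 1.0)   # latent confounder (hypothetical choice)
        e2 = h                       # confounder enters the exposure linearly
        e3 = h ** 3 + 0.5 * rng.uniform(-1.0, 1.0)  # and the outcome non-linearly
        x1 = e1
        x2 = lam12 * x1 + e2
        x3 = lam23 * x2 + e3
        xs1.append(x1)
        xs2.append(x2)
        xs3.append(x3)
    return xs1, xs2, xs3


def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


x1, x2, x3 = simulate_iv(lam12=1.5, lam23=-0.8)
estimate = cov(x1, x3) / cov(x1, x2)  # close to lam23 when lam12 != 0
print(estimate)
```

When `lam12` is set to 0, the denominator is a sample covariance of two independent variables and the ratio becomes unstable, which mirrors the exclusion of the null set $\lambda_{12}=0$.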

While excluding null sets of a finite-dimensional space is a standard approach in related literature (Drton, 2018, §9), speaking of a generic choice of $\varepsilon$ requires clarification, as Definition 2.1 is nonparametric with respect to the distribution of the errors $\varepsilon$. Indeed, our genericity concept has a very specific meaning, namely, that the distribution of $\varepsilon$ satisfies the following assumption.

Assumption 1.

Let $\varepsilon\in\mathcal{M}(\mathcal{G}_{B})$. For every two vectors $a_{1},a_{2}\in\mathbb{R}^{V}$, it holds that $a_{1}^{T}\varepsilon\perp\!\!\!\perp a_{2}^{T}\varepsilon$ implies that $a_{1u}\cdot a_{2v}=0$ whenever $u=v$ or $u\leftrightarrow v\in E_{\leftrightarrow}$.

Our genericity assumption is natural in view of the Darmois-Skitovich theorem. Indeed, this theorem amounts to exactly the statement that if $\mathcal{G}_{B}$ is the empty graph (i.e., has no edges), then Assumption 1 holds for every random vector that has at most one normally distributed coordinate. To further justify our assumption, we present in Section 5 a detailed study of two different classes of submodels for which we show that indeed only a lower-dimensional set of distributions is excluded by our assumption. One class of submodels is built by assuming the existence of moments up to an arbitrary but fixed order. The other class is built by assuming linearity of confounding.

For the remainder of this work, whenever we use the term generic, it is implied that the result holds for any matrix $\Lambda$ outside of a fixed Lebesgue measure zero subset of $\mathbb{R}^{\mathcal{G}_{D}}$ and for any $\varepsilon\in\mathcal{M}(\mathcal{G}_{B})$ that satisfies Assumption 1.

3 Necessary and Sufficient Conditions for Generic Identifiability of Direct Causal Effects

Let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$ be an ADMG, and let $X$ be a random vector in the model $\mathcal{M}(\mathcal{G})$, i.e.,

\[
X=\Phi_{\mathcal{G}}(\Lambda,\varepsilon)\ \text{for}\ (\Lambda,\varepsilon)\in\mathbb{R}^{\mathcal{G}_{D}}\times\mathcal{M}(\mathcal{G}_{B}).
\]

Suppose $X$ can be generated using another pair $(\tilde{\Lambda},\tilde{\varepsilon})\in\Phi_{\mathcal{G}}^{-1}(X)$. From the definition of the fiber in Eq. 2.1, one can see that

\[
\tilde{\varepsilon}\stackrel{d}{=}(I-\tilde{\Lambda})^{T}(I-\Lambda)^{-T}\varepsilon=(I-\tilde{\Lambda})^{T}B_{\Lambda}\varepsilon=:A\varepsilon. \tag{3.1}
\]

The next result shows that the entries of matrix $A=(I-\tilde{\Lambda})^{T}B_{\Lambda}$ can be fully specified as a function of both $\tilde{\Lambda}$ and $B_{\Lambda}$ through the ancestral relations among the nodes of $\mathcal{G}$.

Lemma 3.1.

The entries of matrix $A$ defined in Eq. 3.1 can be written as

\[
a_{vu}=b_{vu}-\sum_{w\in\operatorname{pa}(v)\cap\operatorname{de}(u)}\tilde{\lambda}_{wv}b_{wu}. \tag{3.2}
\]

In particular, we have $a_{vu}=0$ if $v\notin\operatorname{de}(u)$, and $a_{uu}=1$ for every $u\in V$.

Proof.

Writing the product of matrices explicitly, we get

\[
a_{vu}=b_{vu}-\sum_{w\in V}\tilde{\lambda}_{wv}b_{wu}.
\]

From the definition of $\mathbb{R}^{\mathcal{G}_{D}}$, we know that $\tilde{\lambda}_{wv}=0$ if $v\notin\operatorname{ch}(w)$, while it holds that $b_{wu}=0$ if $w\notin\operatorname{de}(u)$, from which the claim follows. To see that $b_{wu}=0$ if $w\notin\operatorname{de}(u)$, note that $B_{\Lambda}$ is a path matrix for the directed part $\mathcal{G}_{D}$, as we detail in Lemma A.1 in the Appendix. ∎
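Lemma 3.1 can be sanity-checked on the four-node example with edges $1\to2$, $1\to3$, $2\to4$, $3\to4$. The sketch below is a hedged illustration in plain Python: the numerical values of $\Lambda$ and $\tilde{\Lambda}$ and all function names are assumptions made for the check. It compares the restricted sum in Eq. 3.2 with the full matrix product and verifies the two stated consequences.

```python
from fractions import Fraction

NODES = (1, 2, 3, 4)
EDGES = ((1, 2), (1, 3), (2, 4), (3, 4))  # directed edges u -> v


def parents(v):
    return [a for (a, b) in EDGES if b == v]


def descendants(u):
    """de(u): nodes reachable from u by directed paths, including u itself."""
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for (a, b) in EDGES:
            if a == x and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def path_matrix(lam):
    """b[w, u] = sum over directed paths u -> w of the edge-weight products."""
    def b(w, u):
        if w == u:
            return Fraction(1)
        return sum((lam[(p, w)] * b(p, u) for p in parents(w)), Fraction(0))
    return {(w, u): b(w, u) for w in NODES for u in NODES}


# Two coefficient matrices with the same support (illustrative values).
lam = {(1, 2): Fraction(2), (1, 3): Fraction(3),
       (2, 4): Fraction(5), (3, 4): Fraction(7)}
lam_tilde = {(1, 2): Fraction(-1), (1, 3): Fraction(4),
             (2, 4): Fraction(1, 2), (3, 4): Fraction(9)}
B = path_matrix(lam)


def a_restricted(v, u):
    """Eq. (3.2): sum only over w in pa(v) that are descendants of u."""
    de_u = descendants(u)
    return B[(v, u)] - sum(lam_tilde[(w, v)] * B[(w, u)]
                           for w in parents(v) if w in de_u)


def a_full(v, u):
    """Entry (v, u) of (I - Lam_tilde)^T B, summing over all w in V."""
    return B[(v, u)] - sum(lam_tilde.get((w, v), 0) * B[(w, u)]
                           for w in NODES)


A = {(v, u): a_restricted(v, u) for v in NODES for u in NODES}
print(all(A[(v, u)] == a_full(v, u) for v in NODES for u in NODES))
```

Setting `lam_tilde = lam` makes `A` the identity matrix, which matches the trivial case $\tilde{\Lambda}=\Lambda$ in Eq. 3.1.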

Definition 3.1.

The set of removable ancestors of a node $v\in V$ is defined as

\[
R_{v}:=\{u\in\operatorname{an}(v)\>:\>\operatorname{Sib}(u)\setminus\operatorname{Sib}(v)\neq\emptyset\}=\operatorname{Sib}(V\setminus\operatorname{Sib}(v))\cap\operatorname{an}(v).
\]

Clearly, $v\notin R_{v}$.

Example 3.1.

Consider the graph in Fig. 2. In this graph, the only strict ancestor of $v_{2}$ is $v_{1}$, which has only $v_{2}$ as its sibling. Hence, $R_{v_{2}}=\operatorname{Sib}(v_{1})\setminus\operatorname{Sib}(v_{2})=\{v_{1},v_{2}\}\setminus\{v_{1},v_{2},v_{4}\}=\emptyset$. On the other hand, $R_{v_{4}}=\{v_{1},v_{2}\}$ because $v_{1}$ belongs to both $\operatorname{Sib}(v_{2})\setminus\operatorname{Sib}(v_{4})$ and $\operatorname{Sib}(v_{1})\setminus\operatorname{Sib}(v_{4})$.
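The computation of removable ancestors can be sketched in a few lines of Python. Since Fig. 2 is not reproduced here, the graph below is a hypothetical one, chosen only so that it reproduces the sibling sets and the sets $R_{v_2}$, $R_{v_4}$ stated in Example 3.1; it need not coincide with the graph in the figure. The function names are likewise illustrative. The sketch evaluates both expressions from Definition 3.1 and checks that they agree.

```python
NODES = (1, 2, 3, 4)
DIRECTED = ((1, 2), (2, 3), (3, 4))    # u -> v   (assumed graph)
BIDIRECTED = ((1, 2), (2, 4), (3, 4))  # u <-> v  (assumed graph)


def ancestors(v):
    """an(v): nodes with a directed path to v, including v itself."""
    seen, stack = {v}, [v]
    while stack:
        x = stack.pop()
        for (a, b) in DIRECTED:
            if b == x and a not in seen:
                seen.add(a)
                stack.append(a)
    return seen


def Sib(nodes):
    """Sib(S): the set S together with all bidirected neighbours of S."""
    out = set(nodes)
    for (a, b) in BIDIRECTED:
        if a in nodes:
            out.add(b)
        if b in nodes:
            out.add(a)
    return out


def removable_ancestors(v):
    """First expression: ancestors u with Sib(u) not contained in Sib(v)."""
    return {u for u in ancestors(v) if Sib({u}) - Sib({v})}


def removable_ancestors_alt(v):
    """Second expression: Sib(V minus Sib(v)) intersected with an(v)."""
    return Sib(set(NODES) - Sib({v})) & ancestors(v)


R = {v: removable_ancestors(v) for v in NODES}
print(R[2], R[4])
```

In this assumed graph, node 3 is an ancestor of node 4 but is not removable, because every sibling of node 3 is also a sibling of node 4.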

Using the concept of removable ancestors, the next result introduces a linear system of equations whose solution space fully characterizes the parameter matrices in $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi^{-1}_{\mathcal{G}}(X))$.

Lemma 3.2.

Let $X=\Phi_{\mathcal{G}}(\Lambda,\varepsilon)$ for a generic choice of parameters $(\Lambda,\varepsilon)\in\mathbb{R}^{\mathcal{G}_{D}}\times\mathcal{M}(\mathcal{G}_{B})$. The matrix $\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_{D}}$ belongs to $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi_{\mathcal{G}}^{-1}(X))$ if and only if it is a solution to the following linear system of equations:

\[
\underbrace{[(B_{\Lambda})_{\operatorname{pa}(v),R_{v}}]^{T}}_{(B_{\Lambda})^{v}}\cdot\tilde{\lambda}_{\operatorname{pa}(v),v}=[(B_{\Lambda})_{v,R_{v}}]^{T},\quad\forall v\in V. \tag{3.3}
\]

Proof.

We start by showing the direct implication. Eq. 3.1 shows that for every $u_{0}\in V\setminus\{v\}$ we can write $\tilde{\varepsilon}_{\{v,u_{0}\}}=A_{\{v,u_{0}\},V}\cdot\varepsilon$, where $A_{\{v,u_{0}\},V}$ denotes the rows of $A$ corresponding to nodes $\{v,u_{0}\}$. Since $\tilde{\varepsilon}\in\mathcal{M}(\mathcal{G}_{B})$, it holds that $\tilde{\varepsilon}_{v}\perp\!\!\!\perp\tilde{\varepsilon}_{u_{0}}$ for every $u_{0}\notin\operatorname{sib}(v)$ and, thus, Assumption 1 implies

\begin{align}
a_{vv_{0}}\cdot a_{u_{0}v_{0}}&=0, & &\forall\,v_{0}\in V, \tag{3.4}\\
a_{vv_{0}}\cdot a_{u_{0}v_{1}}&=0, & &\forall\,v_{0}\leftrightarrow v_{1}\in\mathcal{G}. \tag{3.5}
\end{align}

If $u\notin\operatorname{sib}(v)$, considering $u_{0}=v_{0}=u$ in Eq. 3.4 yields $a_{vu}\cdot a_{uu}=a_{vu}=0$, where we used the fact that $a_{uu}=1$ as a result of Lemma 3.1. Again, from Lemma 3.1, writing $a_{vu}$ explicitly, we get

\[
b_{vu}=\sum_{w\in\operatorname{pa}(v)\cap\operatorname{de}(u)}\tilde{\lambda}_{wv}b_{wu}=[(B_{\Lambda})_{\operatorname{pa}(v),u}]^{T}\cdot\tilde{\lambda}_{\operatorname{pa}(v),v}. \tag{3.6}
\]

Now, let $u\in\operatorname{sib}(v)$, and let $w\in\operatorname{sib}(u)\setminus\operatorname{sib}(v)$. Considering Eq. 3.5 with $u_{0}=v_{1}=w$ and $v_{0}=u$, we get $a_{vu}\cdot a_{ww}=a_{vu}=0$. Proceeding as above, this yields that $u\in\operatorname{an}(v)$ leads to Eq. 3.6, as claimed.

For the reverse implication, consider $\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_{D}}$ such that each one of its column vectors $\tilde{\lambda}_{\operatorname{pa}(v),v}$ is a solution of Eq. 3.3. Define $\tilde{\varepsilon}:=A\cdot\varepsilon$, where the matrix $A$ is defined in Eq. 3.1. By the definition of $\tilde{\varepsilon}$, we have $\Phi_{\mathcal{G}}(\Lambda,\varepsilon)\stackrel{d}{=}\Phi_{\mathcal{G}}(\tilde{\Lambda},\tilde{\varepsilon})$, so it remains to prove $\tilde{\varepsilon}\in\mathcal{M}(\mathcal{G}_{B})$, that is, $\tilde{\varepsilon}$ satisfies the connected set Markov property with respect to $\mathcal{G}_{B}$. Let $C\subseteq V$. Then it is easy to see that

\[
\tilde{\varepsilon}_{C}=A_{C,D(C)}\cdot\varepsilon_{D(C)},
\]

where $D(C)$ is the set of non-removable ancestors of $C$, that is,

\[
D(C):=C\cup\bigcup_{v\in C}(\operatorname{an}(v)\setminus R_{v})\subseteq\operatorname{Sib}(C).
\]

Here, the last set inclusion comes from the definition of $R_{v}$. Hence, to prove that $\tilde{\varepsilon}$ satisfies the connected set Markov property, we need to show that $\tilde{\varepsilon}_{C}\perp\!\!\!\perp\tilde{\varepsilon}_{V\setminus\operatorname{Sib}(C)}$ whenever $C$ is a connected subset of $\mathcal{G}_{B}$. For this, it suffices to show that $\varepsilon_{D(C)}\perp\!\!\!\perp\varepsilon_{D(V\setminus\operatorname{Sib}(C))}$. We will argue that this is indeed the case for $C$ connected by showing i) $D(C)$ is connected and ii) $D(V\setminus\operatorname{Sib}(C))\subseteq V\setminus\operatorname{Sib}(D(C))$. The asserted result will then follow from the fact that $\varepsilon$ satisfies the connected set Markov property with respect to $\mathcal{G}_{B}$.

i) To show that $D(C)$ is connected, consider $u,v\in D(C)$. From the definition of $D(C)$, one can see that there are $u_{0},v_{0}\in C$ such that $u_{0}\in\operatorname{Sib}(u)$ and $v_{0}\in\operatorname{Sib}(v)$. Because $C$ is connected, a bidirected path joining $u_{0}$ and $v_{0}$ exists over $C$, and we can extend the path to suitably join $u$ and $v$ as well.

ii) Notice that $\operatorname{Sib}(D(C))\subseteq\operatorname{Sib}(C)$. This is because if $w\in\operatorname{Sib}(D(C))$, either $w\in\operatorname{Sib}(C)$ or there is $v\in C$ such that $w\in\operatorname{Sib}(\operatorname{an}(v)\setminus R_{v})$, which again implies that $w\in\operatorname{Sib}(C)$ by the definition of $R_{v}$. This implies that in order to prove $D(V\setminus\operatorname{Sib}(C))\subseteq V\setminus\operatorname{Sib}(D(C))$, we only need to show $D(V\setminus\operatorname{Sib}(C))\cap\operatorname{Sib}(C)=\emptyset$. Suppose there exists $u\in D(V\setminus\operatorname{Sib}(C))\cap\operatorname{Sib}(C)$. Then there are $v\in C$ and $w\in V\setminus\operatorname{Sib}(C)$ such that $v\leftrightarrow u\leftrightarrow w\in\mathcal{G}_{B}$ and $u\notin R_{w}$. By the definition of $R_{w}$, this implies $v\in\operatorname{sib}(w)$, which is impossible as $w\in V\setminus\operatorname{Sib}(C)$. This concludes the proof. ∎
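As a worked illustration of the linear system in Eq. 3.3, the sketch below assembles it for an instrumental variable graph, assumed here to be $1\to2\to3$ with $2\leftrightarrow3$ (Fig. 1 is not reproduced, so this structure is an assumption). The coefficient values and function names are illustrative as well. For $v=3$ the system reduces to the single equation $b_{21}\tilde{\lambda}_{23}=b_{31}$, which determines $\tilde{\lambda}_{23}$ exactly when $b_{21}=\lambda_{12}\neq0$.

```python
from fractions import Fraction

DIRECTED = ((1, 2), (2, 3))  # instrument -> exposure -> outcome (assumed)
BIDIRECTED = ((2, 3),)       # latent confounding of exposure and outcome


def parents(v):
    return [a for (a, b) in DIRECTED if b == v]


def ancestors(v):
    """an(v), including v itself."""
    seen, stack = {v}, [v]
    while stack:
        x = stack.pop()
        for a in parents(x):
            if a not in seen:
                seen.add(a)
                stack.append(a)
    return seen


def Sib(u):
    """Sib(u): u together with its bidirected neighbours."""
    out = {u}
    for (a, b) in BIDIRECTED:
        if a == u:
            out.add(b)
        if b == u:
            out.add(a)
    return out


def removable_ancestors(v):
    return sorted(u for u in ancestors(v) if Sib(u) - Sib(v))


def path_entry(w, u, lam):
    """b_{wu}: sum over directed paths u -> w of the edge-weight products."""
    if w == u:
        return Fraction(1)
    return sum((lam[(p, w)] * path_entry(p, u, lam) for p in parents(w)),
               Fraction(0))


def system(v, lam):
    """Return (M, rhs) of Eq. (3.3): rows indexed by R_v, columns by pa(v)."""
    pa, R = parents(v), removable_ancestors(v)
    M = [[path_entry(w, u, lam) for w in pa] for u in R]
    rhs = [path_entry(v, u, lam) for u in R]
    return M, rhs


lam = {(1, 2): Fraction(3, 2), (2, 3): Fraction(-4, 5)}
M3, rhs3 = system(3, lam)
lam23_recovered = rhs3[0] / M3[0][0]  # unique solution since lam12 != 0

# Degenerate case: the instrument has no effect on the exposure.
lam_degenerate = {(1, 2): Fraction(0), (2, 3): Fraction(-4, 5)}
M3_deg, rhs3_deg = system(3, lam_degenerate)  # reads 0 * x = 0
print(lam23_recovered, M3_deg, rhs3_deg)
```

In the degenerate case every value of $\tilde{\lambda}_{23}$ solves the system, so the coefficient is not determined. For larger graphs the one-line division would have to be replaced by a general linear solver with a rank check.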

Definition 3.2.

Let $v\in V$, and let $Q\subseteq(\operatorname{pa}(v)\cup\{v\})$. We define the $v$-rank of $Q$ as

$$r^{v}_{Q}:=\max_{1\leq k\leq|Q|}\{\,(I,P)\in 2^{R_{v}}\times 2^{Q}\,:\,|I|=|P|=k,\ \tilde{\mathcal{P}}(I,P)\neq\emptyset\}, \tag{3.7}$$

where $2^{S}$ denotes the power set of $S$. Recall that $\tilde{\mathcal{P}}(I,P)$ is a set of non-intersecting systems of paths.

Notice that from Eq. 3.7 it is immediate that

$$r^{v}_{\operatorname{pa}(v)\setminus Q}\geq r^{v}_{\operatorname{pa}(v)}-|Q|.$$
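One way to see this bound is sketched below. It assumes (the definition of $\tilde{\mathcal{P}}$ is not restated in this section) that deleting paths from a non-intersecting system leaves a non-intersecting system for the remaining endpoints.

```latex
\text{Take } (I,P) \text{ with } |I|=|P|=r^{v}_{\operatorname{pa}(v)}
\text{ and } \tilde{\mathcal{P}}(I,P)\neq\emptyset,
\text{ and discard the paths that end in } Q. \text{ Then}
\qquad
r^{v}_{\operatorname{pa}(v)\setminus Q}
\;\geq\; |P\setminus Q|
\;\geq\; |P|-|Q|
\;=\; r^{v}_{\operatorname{pa}(v)}-|Q| .
```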

The following theorem, which constitutes our main identifiability result, shows that this lower bound for $r^{v}_{\operatorname{pa}(v)\setminus Q}$ is attained if and only if $\lambda_{Q,v}$ is generically identifiable. The theorem is based on characterizing the linear subspace of the solution set of Eq. 3.3, which describes $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi^{-1}_{\mathcal{G}}(X))$ based on the previous lemma.

Theorem 3.3.

Let $v\in V$, and let $Q\subseteq\operatorname{pa}(v)$. The vector $\lambda_{Q,v}$ is generically identifiable if and only if $r^{v}_{\operatorname{pa}(v)\setminus Q}=r^{v}_{\operatorname{pa}(v)}-|Q|$, where $r^{v}_{Q}$ is defined in Eq. 3.7.

Proof.

The vector $\lambda_{Q,v}$ is identifiable if and only if $\tilde{\lambda}_{Q,v}=\lambda_{Q,v}$ for every $\tilde{\Lambda}\in\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi^{-1}_{\mathcal{G}}(X))$. We know from Lemma 3.2 that $\tilde{\lambda}_{\operatorname{pa}(v),v}$ is a solution of the linear system given in Eq. 3.3 for every such matrix $\tilde{\Lambda}$. Hence, if we define

$$\begin{aligned}
S^{v} &:=\{\tilde{\lambda}_{\operatorname{pa}(v),v}\in\mathbb{R}^{|\operatorname{pa}(v)|}\,:\,[(B_{\Lambda})_{\operatorname{pa}(v),R_{v}}]^{T}\cdot\tilde{\lambda}_{\operatorname{pa}(v),v}=[(B_{\Lambda})_{v,R_{v}}]^{T}\},\\
S^{v}_{Q} &:=\{\tilde{\lambda}_{\operatorname{pa}(v),v}\in S^{v}\,:\,\tilde{\lambda}_{Q,v}=\lambda_{Q,v}\},
\end{aligned}$$

then $\lambda_{Q,v}$ is identifiable if and only if $S^{v}_{Q}=S^{v}$. By definition, $S^{v}_{Q}$ is a linear subspace of $S^{v}$, so the two are equal if and only if they have the same dimension.

We can write $S^{v}_{Q}$ as the solution space of the following linear system

$$\underbrace{\begin{bmatrix}(I_{p})_{Q,[p]}\\ (B_{\Lambda})^{v}\end{bmatrix}}_{(B_{\Lambda})^{v}_{Q}}\cdot\tilde{\lambda}_{\operatorname{pa}(v),v}=\begin{bmatrix}\lambda_{Q,v}\\ [(B_{\Lambda})_{v,R_{v}}]^{T}\end{bmatrix}, \tag{3.8}$$

where $I_{p}$ is the $p\times p$ identity matrix, and $(B_{\Lambda})^{v}$ is defined in Eq. 3.3. We know that the solution space of Eq. 3.8 is not empty since the true parameter vector $\lambda_{\operatorname{pa}(v),v}$ belongs to it. Hence, we have $\dim(S^{v}_{Q})=|\operatorname{pa}(v)|-\operatorname{rank}((B_{\Lambda})^{v}_{Q})$, which implies

$$\dim(S^{v}_{Q})=\dim(S^{v})\iff\operatorname{rank}((B_{\Lambda})^{v}_{Q})=\operatorname{rank}((B_{\Lambda})^{v}).$$

From the definition of $(B_{\Lambda})^{v}_{Q}$ in Eq. 3.8 one can easily see that

$$\operatorname{rank}((B_{\Lambda})^{v}_{Q})=\operatorname{rank}([(B_{\Lambda})^{v}_{R_{v},\operatorname{pa}(v)\setminus Q}])+|Q|=\operatorname{rank}([(B_{\Lambda})_{\operatorname{pa}(v)\setminus Q,R_{v}}]^{T})+|Q|.$$
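The first equality can be made explicit with a block decomposition. The following is a sketch under assumptions not stated in the text: $p=|\operatorname{pa}(v)|$, the columns are ordered so that the parents in $Q$ come first, and $M_{S}:=[(B_{\Lambda})_{S,R_{v}}]^{T}$ is a shorthand introduced here for the columns indexed by $S\subseteq\operatorname{pa}(v)$. The symbol $\sim$ denotes row equivalence, obtained by subtracting $M_{Q}$ times the top block of rows from the bottom block.

```latex
(B_{\Lambda})^{v}_{Q}
=\begin{bmatrix} I_{|Q|} & 0 \\ M_{Q} & M_{\operatorname{pa}(v)\setminus Q} \end{bmatrix}
\;\sim\;
\begin{bmatrix} I_{|Q|} & 0 \\ 0 & M_{\operatorname{pa}(v)\setminus Q} \end{bmatrix},
\qquad\text{so}\qquad
\operatorname{rank}\bigl((B_{\Lambda})^{v}_{Q}\bigr)
=|Q|+\operatorname{rank}\bigl(M_{\operatorname{pa}(v)\setminus Q}\bigr).
```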

Finally, we have

$$\dim(S^{v}_{Q})=\dim(S^{v})\iff\operatorname{rank}([(B_{\Lambda})_{\operatorname{pa}(v)\setminus Q,R_{v}}]^{T})=\operatorname{rank}([(B_{\Lambda})_{\operatorname{pa}(v),R_{v}}]^{T})-|Q|,$$

which concludes the proof by noticing that, by Lemma A.2, $r^{v}_{Q}$ is generically equal to $\operatorname{rank}([(B_{\Lambda})_{Q,R_{v}}]^{T})$ for every subset $Q$ of $\operatorname{pa}(v)$. ∎

Example 3.2.

Consider again the graph in Fig. 2, as in Example 3.1. We have $R_{v_{2}}=\emptyset$, implying that the parameter $\lambda_{v_{1}v_{2}}$ is not identifiable. In contrast, $R_{v_{4}}=\{v_{1},v_{2}\}$, and there is a system of non-intersecting paths from $R_{v_{4}}$ to $\operatorname{pa}(v_{4})=\{v_{2},v_{3}\}$ given by $\pi_{1}=(v_{1},v_{3})$ and $\pi_{2}=(v_{2},v_{2})$. This implies that the vector $\lambda_{\operatorname{pa}(v_{4}),v_{4}}$ is identifiable.

The following theorem characterizes the situations in which the whole matrix $\Lambda$ is identifiable.

Theorem 3.4.

The matrix $\Lambda$ is generically identifiable if and only if for every node $v\in V$, there is a subset $I_{v}$ of $R_{v}$ of size $|\operatorname{pa}(v)|$ such that there is a system of non-intersecting paths from $I_{v}$ to $\operatorname{pa}(v)$.

Proof.

The matrix $\Lambda$ is identifiable if and only if all of its columns are, so we get the statement by applying Theorem 3.3 to each of the columns, with $Q=\operatorname{pa}(v)$. ∎
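The specialization to $Q=\operatorname{pa}(v)$ can be spelled out as follows. This is a short sketch that uses the convention, assumed here, that $r^{v}_{\emptyset}=0$, together with the bound $r^{v}_{Q}\leq|Q|$ that is built into Eq. 3.7.

```latex
Q=\operatorname{pa}(v):\qquad
r^{v}_{\emptyset}=r^{v}_{\operatorname{pa}(v)}-|\operatorname{pa}(v)|
\iff r^{v}_{\operatorname{pa}(v)}=|\operatorname{pa}(v)|
\iff \exists\, I_{v}\subseteq R_{v}:\ |I_{v}|=|\operatorname{pa}(v)|,\
\tilde{\mathcal{P}}(I_{v},\operatorname{pa}(v))\neq\emptyset .
```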

Remark 3.1.

It is noteworthy that Assumption 1 is used only for proving the direct implication of Lemma 3.2. Consequently, the necessity of the graphical condition in Theorem 3.4 also holds if the model is extended by not requiring Assumption 1.

Remark 3.2.

A direct consequence of Lemma 3.2 is that if the matrix $\Lambda$ is not generically identifiable, the fiber $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}}(\Phi_{\mathcal{G}}^{-1}(\Phi_{\mathcal{G}}(\Lambda,\varepsilon)))$ has infinite cardinality. This implies that in our setting, there are no ADMGs that are $k$-to-one with finite $k>1$. This is in contrast with the linear Gaussian case; see, e.g., Foygel et al. (2012, Ex. 8).

4 Certifying Identifiability

Verifying directly whether the condition of Theorem 3.3 is satisfied can be computationally challenging. Following the approach of Brito (2004) and Foygel et al. (2012), we now introduce an alternative approach that can verify the identifiability condition of Theorem 3.3 in polynomial time in the size of the graph via a maximum flow reformulation.

For the sake of completeness, we first revisit the definition of the maximum flow problem; further details are available in Cormen et al. (2009, §26). Subsequently, we introduce our reformulation.

The proofs of the results presented in this section can be found in Section B.1.

4.1 The Maximum Flow Problem

Let $G=(V,D)$ be a directed graph with source node $s\in V$ and sink node $t\in V$. Let $c_{V}:V\to\mathbb{R}_{\geq 0}$ be a node capacity function, and let $c_{D}:D\to\mathbb{R}_{\geq 0}$ be an edge capacity function. A flow on $G$ is a function $f:D\to\mathbb{R}_{\geq 0}$ satisfying

$$\begin{aligned}
\sum_{w\in\operatorname{ch}(v)}f(v,w)=\sum_{u\in\operatorname{pa}(v)}f(u,v)&\leq c_{V}(v),\quad\forall v\in V\setminus\{s,t\},\\
f(u,v)&\leq c_{D}(u,v),\quad\forall u\to v\in D.
\end{aligned} \tag{4.1}$$

The size of a flow f𝑓fitalic_f is defined as

$$|f|:=\sum_{w\in\operatorname{ch}(s)}f(s,w)=\sum_{u\in\operatorname{pa}(t)}f(u,t). \tag{4.2}$$

The max-flow problem on $(G,s,t,c_{V},c_{D})$ is the problem of finding a flow $f$ whose size $|f|$ is maximum.
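To make the constraints concrete, here is a minimal Python sketch that checks whether a candidate flow satisfies the conservation and capacity constraints in Eq. 4.1 and, if so, returns its size as in Eq. 4.2. The function name `flow_size`, the dictionary-based encoding, and the toy network are illustrative assumptions, not part of the paper.

```python
def flow_size(edge_cap, node_cap, flow, s, t):
    """Return |f| if `flow` is feasible, otherwise raise ValueError.

    edge_cap: {(u, v): capacity}, node_cap: {v: capacity} (missing = unbounded),
    flow: {(u, v): value} (missing = 0). Hypothetical encoding for illustration.
    """
    if set(flow) - set(edge_cap):
        raise ValueError("flow assigned to a pair that is not an edge")
    inflow, outflow = {}, {}
    for (u, v), c in edge_cap.items():
        f = flow.get((u, v), 0)
        if not 0 <= f <= c:  # edge capacity constraint
            raise ValueError(f"edge capacity violated on {u}->{v}")
        outflow[u] = outflow.get(u, 0) + f
        inflow[v] = inflow.get(v, 0) + f
    for v in set(inflow) | set(outflow):
        if v in (s, t):
            continue
        if inflow.get(v, 0) != outflow.get(v, 0):  # conservation at inner nodes
            raise ValueError(f"flow not conserved at {v}")
        if inflow.get(v, 0) > node_cap.get(v, float("inf")):  # node capacity
            raise ValueError(f"node capacity violated at {v}")
    return outflow.get(s, 0)  # total flow leaving the source


# Toy network s -> a -> t with unit capacity on the inner node a.
INF = float("inf")
toy_edges = {("s", "a"): INF, ("a", "t"): INF}
toy_nodes = {"a": 1}
size = flow_size(toy_edges, toy_nodes, {("s", "a"): 1, ("a", "t"): 1}, "s", "t")
```

Sending two units through `a` in the same network would raise a `ValueError`, since the node capacity of `a` is 1.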

4.2 Deciding Generic Identifiability

For every node $v\in V$ and every $Q\subseteq\operatorname{pa}(v)$, let $G^{v}_{Q}=(V^{v}_{Q},E^{v}_{Q})$ be defined as follows:

$$\begin{aligned}
V^{v}_{Q}&:=\operatorname{an}(v)\cup\{s_{v},t_{v}\},\\
E^{v}_{Q}&:=\{s_{v}\to u\,:\,u\in R_{v}\}\cup\{u\to t_{v}\,:\,u\in Q\}\cup\{u\to w\,:\,u\to w\in\mathcal{G}\},
\end{aligned}$$

where $s_{v}$ and $t_{v}$ are, respectively, newly introduced source and sink nodes. The edge capacity is $\infty$ for all the edges. The node capacity is $\infty$ for both the sink and the source, and $1$ otherwise. We denote the maximum size of any flow on $G^{v}_{Q}$ by $\operatorname{max-flow}(G^{v}_{Q})$.

Lemma 4.1.

It holds that $\operatorname{max-flow}(G^{v}_{Q})=r^{v}_{Q}$.

Theorem 4.2.

Given a mixed graph $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$, a node $v\in V$, and any $Q\subseteq\operatorname{pa}(v)$, the generic identifiability of $\lambda_{Q,v}$ holds if and only if $\operatorname{max-flow}(G^{v}_{Q})=|Q|$, which can be certified in $\mathcal{O}(|V|^{2+o(1)})$ time.

Theorem 4.3.

Given a mixed graph $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$, the generic identifiability of $\Lambda$ holds if and only if $\operatorname{max-flow}(G^{v}_{\operatorname{pa}(v)})=|\operatorname{pa}(v)|$ for all $v\in V$, which can be certified in $\mathcal{O}(|V|^{3+o(1)})$ time.
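The max-flow reformulation can be sketched in a few lines of Python. The code below computes $r^{v}_{Q}$ as in Lemma 4.1 by splitting every node into an "in" and an "out" copy joined by a unit-capacity edge, and then evaluates the rank criterion of Theorem 3.3. It is a sketch under stated assumptions: the set $R_{v}$ is supplied by the caller (its definition precedes this section), the function names are hypothetical, the full directed edge set is passed instead of its restriction to $\operatorname{an}(v)$ (nodes that are not ancestors of $v$ cannot lie on a source-to-sink path in a DAG), and plain breadth-first augmentation is used, which does not attain the running times stated in Theorems 4.2 and 4.3. The edge list is a hypothetical graph chosen to be consistent with Example 3.2, not necessarily the graph of Fig. 2.

```python
from collections import deque


def v_rank(directed_edges, R_v, Q):
    """Maximum number of vertex-disjoint directed paths from R_v to Q.

    A node lying in both R_v and Q counts as a trivial path.
    """
    S, T = ("source",), ("sink",)
    nodes = set(R_v) | set(Q)
    for u, w in directed_edges:
        nodes.update((u, w))
    big = len(nodes) + 1  # stands in for the infinite edge capacity
    cap = {}  # residual capacities, cap[x][y]

    def add_edge(x, y, c):
        cap.setdefault(x, {})[y] = cap.get(x, {}).get(y, 0) + c
        cap.setdefault(y, {}).setdefault(x, 0)  # reverse residual edge

    for u in nodes:
        add_edge(("in", u), ("out", u), 1)  # unit node capacity
    for u, w in directed_edges:
        add_edge(("out", u), ("in", w), big)
    for u in R_v:
        add_edge(S, ("in", u), big)
    for u in Q:
        add_edge(("out", u), T, big)
    cap.setdefault(S, {})
    cap.setdefault(T, {})

    flow = 0
    while True:
        parent = {S: None}
        queue = deque([S])
        while queue and T not in parent:  # breadth-first search in the residual graph
            x = queue.popleft()
            for y, c in cap[x].items():
                if c > 0 and y not in parent:
                    parent[y] = x
                    queue.append(y)
        if T not in parent:
            return flow
        y = T
        while parent[y] is not None:  # push one unit along the path found
            x = parent[y]
            cap[x][y] -= 1
            cap[y][x] += 1
            y = x
        flow += 1


def is_generically_identifiable(directed_edges, R_v, pa_v, Q):
    """Rank criterion of Theorem 3.3, with ranks computed via max-flow."""
    rest = set(pa_v) - set(Q)
    r_rest = v_rank(directed_edges, R_v, rest)
    r_full = v_rank(directed_edges, R_v, pa_v)
    return r_rest == r_full - len(set(Q))


# Hypothetical edge set consistent with Example 3.2.
edges = [("v1", "v2"), ("v1", "v3"), ("v2", "v4"), ("v3", "v4")]
rank_v4 = v_rank(edges, {"v1", "v2"}, {"v2", "v3"})
ident_v4 = is_generically_identifiable(edges, {"v1", "v2"}, {"v2", "v3"}, {"v2", "v3"})
ident_v2 = is_generically_identifiable(edges, set(), {"v1"}, {"v1"})
```

Integer capacities keep every augmentation at exactly one unit, so the loop runs at most once per node.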

Example 4.1.

Fig. 4 illustrates the maximum flows obtained when the criterion from Theorem 4.2 is applied to two of the nodes of the ADMG in Fig. 2.

$G^{v_{2}}_{\operatorname{pa}(v_{2})}$: The graph is constructed for parameter $\lambda_{12}$. The only flow on $G^{v_{2}}_{\operatorname{pa}(v_{2})}$ is the trivial flow setting all edges to $0$. Hence, $\lambda_{12}$ is not identifiable.

$G^{v_{4}}_{\operatorname{pa}(v_{4})}$: The graph is constructed for parameter $\Lambda_{\{2,3\},4}$. The figure displays a flow on $G^{v_{4}}_{\operatorname{pa}(v_{4})}$ of size $|\operatorname{pa}(v_{4})|=2$. Consequently, the parameters $\lambda_{24}$ and $\lambda_{34}$ are identifiable.

[Figure 4 panels: $G^{v_{2}}_{\operatorname{pa}(v_{2})}$ on nodes $s_{v_{2}}$, $v_{1}$, $t_{v_{2}}$, with flow $0$; $G^{v_{4}}_{\operatorname{pa}(v_{4})}$ on nodes $s_{v_{4}}$, $v_{1}$, $v_{2}$, $v_{3}$, $t_{v_{4}}$, with edge flows $0$ or $1$.]
Figure 4: Two maximum flow problems corresponding to the ADMG of Fig. 2.

5 The Genericity Condition for the Error Distribution

The idea underlying Assumption 1 is that it should not be possible to linearly disentangle a general dependence between two errors $\varepsilon_{u}$ and $\varepsilon_{v}$. In other words, if two different linear combinations of $\varepsilon$ are independent, then at least one of them cannot have any signal coming from $(\varepsilon_{u},\varepsilon_{v})$. The purpose of this section is to prove that this property indeed holds for two tractable subfamilies of joint distributions for the errors. Specifically, Section 5.1 considers the setting in which dependence is generated through linear latent factor models, and Section 5.2 treats distributions with finite moments.

5.1 Linear Factor Models

Assume that the error vector $\varepsilon$ is generated according to a sparse factor model that respects the Markov property of the bidirected part $\mathcal{G}_{B}$ of a given ADMG $\mathcal{G}$. Define a latent factor graph for $\mathcal{G}_{B}$ to be any DAG $\mathcal{L}=(V\cup L,E_{\mathcal{L}})$, in which the latent nodes $L$ are source nodes and whose latent projection (see Verma and Pearl (1990, Sec. 3)) on the nodes in $V$ is equal to $\mathcal{G}_{B}$. Define $\mathcal{M}(k)$ to be the set of $k$-dimensional random vectors with independent and non-Gaussian components. Then, the sparse factor model associated to $\mathcal{L}$ is the set of random vectors

$$\mathcal{M}^{\mathcal{L}}(\mathcal{G}_{B})=\{\varepsilon\in\mathcal{M}(\mathcal{G}_{B})\,:\,\exists\,\eta\in\mathcal{M}(|V|+|L|),\ H\in\mathbb{R}^{\mathcal{L}},\ \varepsilon=H_{L,V}^{T}\cdot\eta_{L}+\eta_{V}\}. \tag{5.1}$$
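As an illustration of Eq. 5.1, the following Python sketch draws one error vector $\varepsilon=H_{L,V}^{T}\cdot\eta_{L}+\eta_{V}$. Everything specific in it is an assumption made for the example: the function name `sample_errors`, the use of centred uniform variables as the independent non-Gaussian sources, and the toy latent factor graph with a single latent node `l1` pointing to `u` and `v` (so that $\mathcal{G}_{B}$ contains only the edge $u\leftrightarrow v$).

```python
import random


def sample_errors(H, latent, observed, rng):
    """Draw one error vector eps = H_{L,V}^T eta_L + eta_V.

    H maps (latent, observed) pairs to loadings; missing pairs are zero.
    """
    # Independent non-Gaussian sources: centred uniforms (an illustrative choice).
    eta_L = {l: rng.uniform(-1.0, 1.0) for l in latent}
    eta_V = {v: rng.uniform(-1.0, 1.0) for v in observed}
    eps = {
        v: eta_V[v] + sum(H.get((l, v), 0.0) * eta_L[l] for l in latent)
        for v in observed
    }
    return eps, eta_L, eta_V


# Hypothetical latent factor graph: l1 -> u, l1 -> v; node w has no latent parent.
H = {("l1", "u"): 0.8, ("l1", "v"): -1.5}
eps, eta_L, eta_V = sample_errors(H, ["l1"], ["u", "v", "w"], random.Random(0))
```

In this toy model, `eps["u"]` and `eps["v"]` share the source `eta_L["l1"]` and are therefore dependent, while `eps["w"]` coincides with its own idiosyncratic source.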
Theorem 5.1.

Let $\mathcal{L}=(V\cup L,E_{\mathcal{L}})$ be a latent factor graph for $\mathcal{G}_B$, and for any subset $C\subset V$ define $L_C:=\{l\in L\,:\,\mathrm{ch}_{\mathcal{L}}(l)\subseteq C\}$. If for every edge $u\leftrightarrow v\in\mathcal{G}_B$ there is a clique $C_{uv}$ (a subset of $V$ for which every pair of nodes is adjacent) in $\mathcal{G}_B$ such that $|L_{C_{uv}}|\geq|C_{uv}|-1$, then $\varepsilon$ satisfies 1 for Lebesgue-almost every matrix $H\in\mathbb{R}^{\mathcal{L}}$.
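The graphical condition of Theorem 5.1 can be checked mechanically on small graphs. The following is a brute-force sketch, assuming the latent factor graph is given as a map from latent nodes to their observed children; the function names and both example graphs are hypothetical. It enumerates all supersets of each edge, so it is exponential in $|V|$ and is meant only to make the condition concrete, not as an efficient algorithm.

```python
from itertools import combinations

def bidirected_edges(children):
    """Latent projection of a latent factor graph: u <-> v iff some latent
    source node has both u and v among its children."""
    edges = set()
    for kids in children.values():
        edges.update(frozenset(pair) for pair in combinations(sorted(kids), 2))
    return edges

def latent_count(children, clique):
    """|L_C|: number of latent nodes whose children all lie in C."""
    return sum(1 for kids in children.values() if set(kids) <= set(clique))

def satisfies_clique_condition(children, observed):
    """True iff every edge u <-> v lies in a clique C with |L_C| >= |C| - 1."""
    edges = bidirected_edges(children)

    def is_clique(nodes):
        return all(frozenset(p) in edges for p in combinations(nodes, 2))

    for edge in edges:
        others = [v for v in observed if v not in edge]
        found = False
        for r in range(len(others) + 1):
            for extra in combinations(others, r):
                clique = tuple(edge) + extra
                if is_clique(clique) and latent_count(children, clique) >= len(clique) - 1:
                    found = True
                    break
            if found:
                break
        if not found:
            return False
    return True

# Hypothetical examples: one latent per edge versus one latent with three children.
per_edge = {"l1": ["v1", "v2"], "l2": ["v2", "v3"]}
shared = {"l1": ["v1", "v2", "v3"]}
```

In the second example the only candidate cliques for the edge between `v1` and `v2` are the pair itself, with $|L_C|=0$, and the full triangle, with $|L_C|=1<2$, so the condition fails.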

Proof.

Let $a_1,a_2\in\mathbb{R}^V$, and consider $\varepsilon=H_{L,V}^{T}\cdot\eta_L+\eta_V$ as in Eq. 5.1. Suppose that $a_1^T\varepsilon$ and $a_2^T\varepsilon$ are independent. Since the components of $\eta$ are independent and non-Gaussian, applying the Darmois-Skitovich theorem (Comon and Jutten, 2010, Thm. 9.5) to $a_1^T\varepsilon$ and $a_2^T\varepsilon$, we obtain that

\begin{align}
a_{1s}\cdot a_{2s} &= 0, & \forall s\in V, \tag{5.2}\\
(a_1^T H_{L,V}^T)_l\cdot(a_2^T H_{L,V}^T)_l &= 0, & \forall l\in L. \tag{5.3}
\end{align}

Note that Eq. 5.2 already gives the part of the claim referring to the case $u=v$ in 1. It remains to consider the case of two nodes $u,v$ that are adjacent in $\mathcal{G}_B$.

Let $u\leftrightarrow v\in\mathcal{G}_B$, and assume for contradiction that $a_{1u}\cdot a_{2v}\neq 0$. Consider a clique $C_{uv}$ as in the statement of the theorem. The vector $a_{C_{u,v}}:=(a_{1,C_{u,v}},a_{2,C_{u,v}})$ is a solution of the following system of quadratic equations:

\[
\left(\sum_{c\in C_{uv}}a_{1c}H_{cl}\right)\cdot\left(\sum_{c\in C_{uv}}a_{2c}H_{cl}\right)=0,\quad l\in L_{C_{u,v}};
\]

we denote the system by $\mathcal{S}_{C_{u,v}}$. Notice that from Eq. 5.2 we know that the vector $a_{C_{u,v}}$ has at most $|C_{u,v}|$ non-zero entries. We now show that, for a generic choice of the entries of $H$, $\mathcal{S}_{C_{u,v}}$ does not admit solutions with $a_{1u}\cdot a_{2v}\neq 0$. Following the case distinctions resulting from the vanishing of the first or the second factor in each equation of $\mathcal{S}_{C_{u,v}}$, the solution set of $\mathcal{S}_{C_{u,v}}$ can be written as the union of the solution sets of $2^{|L_{C_{u,v}}|}$ homogeneous linear systems. Each of these linear systems can be characterized by a partition of $L_{C_{u,v}}$ defined as follows:

\[
L_1:=\{l\in L_{C_{u,v}}\,:\,(a_1^T H_{L,V}^T)_l=0\},\quad L_2:=L_{C_{u,v}}\setminus L_1.
\]

We denote by $\mathcal{S}_1$ and $\mathcal{S}_2$ the linear systems associated to $L_1$ and $L_2$, respectively. Define $V_1=\{v\in C_{u,v}\,:\,a_{1v}=0\}$ and $V_2=\{v\in C_{u,v}\,:\,a_{2v}=0\}$.

If $V_1\cap V_2\neq\emptyset$, the vector $a_{C_{u,v}}$ has at most $|C_{u,v}|-1$ non-zero entries, implying that $\mathcal{S}_1\cup\mathcal{S}_2$ has $|L_{C_{u,v}}|$ equations and $|C_{u,v}|-1$ parameters. If $|L_{C_{u,v}}|\geq|C_{u,v}|-1$, for a generic choice of the entries of $H$, such a system admits only the zero solution (Okamoto, 1973, Lemma). Hence, the assumption that $a_{1u}\cdot a_{2v}\neq 0$ leads to a contradiction.

For $V_1\cap V_2=\emptyset$, we now show that either $\mathcal{S}_1$ or $\mathcal{S}_2$ admits only the zero solution. Notice that since $a_{1u}\cdot a_{2v}\neq 0$ we have $V_1,V_2\neq\emptyset$. This implies that both $\mathcal{S}_1$ and $\mathcal{S}_2$ can have a non-zero solution for a generic choice of the entries of $H$ only if

\[
|L_1|\leq|V_1|-1,\quad |L_2|\leq|V_2|-1.
\]

This would lead to $|L_{C_{u,v}}|=|L_1|+|L_2|\leq|V_1|+|V_2|-2=|C_{u,v}|-2$, which contradicts $|L_{C_{u,v}}|\geq|C_{u,v}|-1$. ∎
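As a sanity check of the argument, consider the smallest instance, under the illustrative assumption that $C_{uv}=\{u,v\}$ and $L_{C_{uv}}=\{l\}$, so that $|L_{C_{uv}}|=1=|C_{uv}|-1$. If $a_{1u}\cdot a_{2v}\neq 0$, then Eq. 5.2 forces $a_{2u}=a_{1v}=0$, and the single equation of the system reduces to

\[
\left(a_{1u}H_{ul}+a_{1v}H_{vl}\right)\cdot\left(a_{2u}H_{ul}+a_{2v}H_{vl}\right)=\left(a_{1u}H_{ul}\right)\cdot\left(a_{2v}H_{vl}\right)=0 .
\]

This cannot hold when $H_{ul}\cdot H_{vl}\neq 0$, which is the case for a generic choice of the entries of $H$.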

Corollary 5.2.

Let $\mathcal{D}(\mathcal{G}_B)$ be the canonical DAG associated to $\mathcal{G}_B$ (Richardson and Spirtes, 2002, §6). Then 1 is satisfied for a generic choice of parameters of $\mathcal{M}^{\mathcal{D}(\mathcal{G}_B)}(\mathcal{G}_B)$.
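The corollary can be read off from Theorem 5.1 by taking the two-node clique $C_{uv}=\{u,v\}$ for each edge. The sketch below assumes that the canonical DAG of a bidirected graph introduces one latent source node per bidirected edge, with exactly the two endpoints as children; the function names and the example graph are hypothetical.

```python
def canonical_dag(bidirected):
    """One latent source node per bidirected edge, with exactly its two
    endpoints as children (the construction assumed in this sketch)."""
    return {f"l_{u}_{v}": (u, v) for (u, v) in bidirected}

def edge_cliques_suffice(bidirected):
    """Check |L_C| >= |C| - 1 using the two-node clique C = {u, v} of each edge."""
    children = canonical_dag(bidirected)
    for (u, v) in bidirected:
        l_c = [l for l, kids in children.items() if set(kids) <= {u, v}]
        if len(l_c) < 1:  # |C| - 1 = 1 for a two-node clique
            return False
    return True

# Hypothetical bidirected graph: a path v1 - v2 - v3 - v4 plus the chord v1 - v3.
example = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v1", "v3")]
```

Each edge contributes its own latent node to $L_{\{u,v\}}$, so the inequality $|L_{C_{uv}}|\geq 1$ holds by construction.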

Remark 5.1.

We point out that in the non-Gaussian setting, sparse linear factor analysis models have testable implications, see, e.g., Ardiyansyah and Sodomaco (2023); Xie et al. (2023); Schkoda and Drton (2023). Hence, the failure of 1 could, in principle, be tested by testing all linear factor models leading to the failure.

Example 5.1.

We borrow the example in Fig. 5 from Barber et al. (2022, Fig. 5). Notice that in the proof of our main result, 1 is used only for matrices with a specific structure, described in Lemma 3.1. Therefore, we focus on this type of matrices. In particular, we will consider the matrix

\[
A=\begin{pmatrix}a_1^T\\ a_2^T\end{pmatrix}=\begin{pmatrix}1&0&0&0&0\\ 0&0&a_{53}&a_{54}&1\end{pmatrix}, \tag{5.4}
\]

and the bidirected graph $\mathcal{G}_B$ corresponding to $\mathcal{G}$ in Fig. 5, with respect to the latent factor models $\mathcal{L},\mathcal{L}_1,\mathcal{L}_2$ given in Fig. 5 and Fig. 6.

  1.

    Consider the pair $v_2\leftrightarrow v_3\in\mathcal{G}_B$. The only latent parent of both nodes in $\mathcal{L}$ is $l_1$, and $\mathrm{ch}(l_1)=\{v_1,v_2,v_3,v_4\}$. This means that the only clique we can consider is $C_{v_2v_3}=\{v_1,v_2,v_3,v_4\}$, for which $|L_{C_{v_2v_3}}|=1$; hence the condition in Theorem 5.1 is violated. We now show that Eq. 5.3 has a nonzero solution. Indeed, the only latent variable for which the system is not trivially satisfied is $l_1$, implying that any solution of the equation $a_{53}H_{3l_1}+a_{54}H_{4l_1}=0$ is also a solution of Eq. 5.3.

    [Figure panels: the graph $\mathcal{G}$ on nodes $v_1,\dots,v_5$; the latent factor graph $\mathcal{L}$ on nodes $v_1,\dots,v_5$ with latent nodes $l_1,l_2,l_3$.]
    Figure 5: A latent factor model under which 1 does not hold and the corresponding latent projection.
  2.

    It is straightforward to see that after adding a latent node $l_4$, as in the graph $\mathcal{L}_1$ in Fig. 6, the condition of Theorem 5.1 is still not satisfied. However, in this case, 1 cannot be violated by the matrices described in Eq. 5.4. To see this, consider Eq. 5.3 for the latent variables $l_1$ and $l_4$, which leads to the following system of equations for $(a_{53},a_{54})$:

    \[
    \begin{cases}
    H_{1l_1}(a_{53}H_{3l_1}+a_{54}H_{4l_1}) = 0,\\
    H_{1l_4}(a_{53}H_{3l_1}) = 0.
    \end{cases}
    \]

    Clearly, for a generic choice of the entries of $H$, the only solution to this system of equations is $(a_{53},a_{54})=(0,0)$.

    [Figure panels: the latent factor graph $\mathcal{L}_1$ on nodes $v_1,\dots,v_5$ with latent nodes $l_1,\dots,l_4$ (left); the latent factor graph $\mathcal{L}_2$ on nodes $v_1,\dots,v_5$ with latent nodes $l_1,\dots,l_5$ (right).]
    Figure 6: Two latent factor models with the same latent projection as in Fig. 5, under which 1 holds generically for matrices as in Eq. 5.4. The graph on the left does not satisfy the condition of Theorem 5.1, while the graph on the right does. Hence, the condition introduced in Theorem 5.1 is sufficient but not necessary.
  3.

    The graph $\mathcal{L}_2$ satisfies the hypothesis of Theorem 5.1; hence it does not violate 1.
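The two linear systems in items 1 and 2 can be checked with exact arithmetic. The sketch below plugs in hypothetical nonzero values for the entries of $H$ that appear in the example; the variable names mirror those entries and are otherwise arbitrary. For item 1 it exhibits a nonzero solution of the single equation, and for item 2 it computes the determinant of the $2\times 2$ coefficient matrix of the system in $(a_{53},a_{54})$.

```python
from fractions import Fraction

# Hypothetical nonzero edge weights; names mirror the H entries in the example.
H1l1, H3l1, H4l1, H1l4 = Fraction(2), Fraction(3), Fraction(5), Fraction(7)

# Item 1: a single equation a53*H3l1 + a54*H4l1 = 0 in two unknowns.
a53, a54 = H4l1, -H3l1            # a nonzero solution
residual_item1 = a53 * H3l1 + a54 * H4l1

# Item 2: coefficient matrix of the linear system in (a53, a54),
# with rows taken from the two displayed equations.
M = [[H1l1 * H3l1, H1l1 * H4l1],
     [H1l4 * H3l1, Fraction(0)]]
det_item2 = M[0][0] * M[1][1] - M[0][1] * M[1][0]
```

A nonzero determinant means that the system of item 2 admits only the zero solution. The determinant is a product of entries of $H$ up to sign, so it is nonzero for a generic choice of $H$.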

5.2 Random Variables with Finite Moments

We now turn to a setting where the error vector has finite moments up to a suitable order. As we show in Theorem 5.5 below, the distributions at which 1 fails define a set of moments, or also cumulants, that form a Lebesgue null set in all possible moments/cumulants up to the considered truncation order. The proofs for the results presented in this section can be found in Section B.2.

Definition 5.1.

The $k$-th cumulant tensor of a random vector $\varepsilon=(\varepsilon_1,\dots,\varepsilon_p)$ is the $k$-way tensor in $\mathbb{R}^{p\times\dots\times p}\equiv(\mathbb{R}^p)^k$ whose entry in position $(i_1,\dots,i_k)$ is the joint cumulant

\[
\mathcal{C}^{(k)}(\varepsilon)_{i_1,\dots,i_k}:=\sum_{(A_1,\dots,A_L)}(-1)^{L-1}(L-1)!\,\mathbb{E}\bigg[\prod_{j\in A_1}\varepsilon_j\bigg]\cdots\mathbb{E}\bigg[\prod_{j\in A_L}\varepsilon_j\bigg],
\]

where the sum is taken over all partitions $(A_1,\dots,A_L)$ of the multiset $\{i_1,\dots,i_k\}$.

Cumulant tensors are symmetric, i.e.,

\[
\mathcal{C}^{(k)}(\varepsilon)_{i_1,\dots,i_k}=\mathcal{C}^{(k)}(\varepsilon)_{i_{\sigma(1)},\dots,i_{\sigma(k)}}\quad\forall\sigma\in S_k,
\]

where $S_k$ is the symmetric group on $[k]=\{1,\dots,k\}$. We write $\operatorname{Sym}_k(p)$ for the subspace of symmetric tensors in $(\mathbb{R}^p)^k$.
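The partition formula of Definition 5.1 can be evaluated exactly for distributions with finite support. The following sketch assumes a joint distribution given as a dictionary from outcome tuples to probabilities; the helper names and the example distribution are hypothetical. Partitions are taken over index positions, so that repeated indices in the multiset are handled correctly.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def set_partitions(items):
    """Yield all partitions of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into an existing block ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def moment(pmf, idx):
    """E[prod_{j in idx} eps_j] for a finite joint pmf {outcome tuple: prob}."""
    total = Fraction(0)
    for outcome, prob in pmf.items():
        term = prob
        for j in idx:
            term *= outcome[j]
        total += term
    return total

def joint_cumulant(pmf, indices):
    """Entry (i_1, ..., i_k) of the k-th cumulant tensor via the partition formula."""
    positions = list(range(len(indices)))
    total = Fraction(0)
    for partition in set_partitions(positions):
        L = len(partition)
        term = Fraction((-1) ** (L - 1) * factorial(L - 1))
        for block in partition:
            term *= moment(pmf, [indices[pos] for pos in block])
        total += term
    return total

# Hypothetical example: two independent components, each uniform on {-1, 0, 2}.
values = [Fraction(-1), Fraction(0), Fraction(2)]
pmf = {(x, y): Fraction(1, 9) for x, y in product(values, values)}
```

For this distribution, `joint_cumulant(pmf, (0, 0))` is the variance of the first component, and mixed entries such as `joint_cumulant(pmf, (0, 0, 1))` vanish because the two components are independent.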

1 involves linear combinations of the entries of a random vector. We will, thus, have to consider cumulants after linear transformation, for which we can leverage the following fact.

Lemma 5.3 (Comon and Jutten (2010), §5, Eq. 5.8).

Let $\varepsilon=(\varepsilon_1,\dots,\varepsilon_p)$ be any $p$-variate random vector, and let $A=(a_{ij})\in\mathbb{R}^{s\times p}$ for any $s\in\mathbb{N}$. Then

\[
\mathcal{C}^{(k)}(A\cdot\varepsilon)_{i_1,\dots,i_k}=\sum_{j_1,\dots,j_k}\mathcal{C}^{(k)}(\varepsilon)_{j_1,\dots,j_k}\,a_{i_1j_1}\cdots a_{i_kj_k}.
\]
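The multilinearity in Lemma 5.3 can be verified exactly for $k=3$ on a small example. The sketch below uses the fact that third-order joint cumulants coincide with third-order central moments, and the convention $(A\varepsilon)_i=\sum_j a_{ij}\varepsilon_j$ for $A\in\mathbb{R}^{s\times p}$. The distribution, the matrix `A`, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

values = [Fraction(-1), Fraction(0), Fraction(2)]      # hypothetical support
p = 2
# Joint pmf of eps = (eps_1, eps_2) with independent uniform components.
pmf = {outcome: Fraction(1, 9) for outcome in product(values, repeat=p)}

A = [[Fraction(1), Fraction(2)],
     [Fraction(0), Fraction(-1)],
     [Fraction(3), Fraction(1)]]                       # hypothetical 3 x 2 matrix
s = len(A)

def third_cumulant(pmf, dim, transform=None):
    """Third-order cumulant tensor of eps (or of transform @ eps), computed
    as the tensor of third-order central moments."""
    def image(x):
        if transform is None:
            return list(x)
        return [sum(row[j] * x[j] for j in range(len(x))) for row in transform]

    mean = [sum(prob * image(x)[i] for x, prob in pmf.items()) for i in range(dim)]
    tensor = {}
    for idx in product(range(dim), repeat=3):
        tensor[idx] = sum(
            prob * (image(x)[idx[0]] - mean[idx[0]])
                 * (image(x)[idx[1]] - mean[idx[1]])
                 * (image(x)[idx[2]] - mean[idx[2]])
            for x, prob in pmf.items())
    return tensor

C_eps = third_cumulant(pmf, p)                  # cumulant tensor of eps
C_direct = third_cumulant(pmf, s, transform=A)  # cumulant tensor of A @ eps

# Right-hand side of the lemma with k = 3.
C_formula = {
    idx: sum(C_eps[j] * A[idx[0]][j[0]] * A[idx[1]][j[1]] * A[idx[2]][j[2]]
             for j in product(range(p), repeat=3))
    for idx in product(range(s), repeat=3)
}
```

The tensors `C_direct` and `C_formula` agree entry by entry. Since the components of $\varepsilon$ are independent here, `C_eps` is diagonal and the sum in the formula reduces to its two diagonal terms.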

In order to justify 1 we wish to offer statements of its generic validity. Our strategy to do so in the present context is to consider cumulants up to a suitable truncation order $k$. In the remainder of this section we consider a mixed graph $\mathcal{G}$ with $p$ nodes, which we label by taking the vertex set to be $V=[p]$.

Definition 5.2.

Let $\mathcal{M}_{\infty}(\mathcal{G}_{B})$ be the subset of $\mathcal{M}(\mathcal{G}_{B})$ yielding distributions with all moments finite. For any integer $k\geq 2$, let

$$\mathcal{M}^{(k)}(\mathcal{G}_{B})=\left\{\mathcal{C}^{(k)}\in\operatorname{Sym}_{k}(\mathbb{R}^{p})\>:\>\mathcal{C}^{(k)}_{i_{1},\dots,i_{k}}=0\text{ if }\{i_{1},\dots,i_{k}\}\text{ is not connected in }\mathcal{G}_{B}\right\}.$$

Moreover, we let

$$\mathcal{M}^{\leq k}(\mathcal{G}_{B})=\mathcal{M}^{(2)}(\mathcal{G}_{B})\times\cdots\times\mathcal{M}^{(k)}(\mathcal{G}_{B}).$$
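To make the zero pattern in this definition concrete, the following minimal Python sketch lists the index tuples of a $k$-th order tensor that are forced to vanish. It assumes that the bidirected graph is given as a list of unordered vertex pairs on the vertices $0,\dots,p-1$, and it reads "connected" as connectivity of the subgraph induced by the distinct indices. The function names `is_connected` and `forced_zero_entries` are illustrative and do not come from the paper.

```python
from itertools import combinations_with_replacement


def is_connected(nodes, bidirected_edges):
    """Return True if `nodes` induces a connected subgraph of the bidirected graph."""
    nodes = set(nodes)
    if not nodes:
        return True
    # Adjacency lists restricted to the induced subgraph on `nodes`.
    adj = {v: set() for v in nodes}
    for u, v in bidirected_edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from an arbitrary start vertex.
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def forced_zero_entries(p, k, bidirected_edges):
    """Index tuples i_1 <= ... <= i_k whose cumulant entry must be zero."""
    return [
        idx
        for idx in combinations_with_replacement(range(p), k)
        if not is_connected(idx, bidirected_edges)
    ]


# Hypothetical example: three error terms with a single bidirected edge 0 <-> 1.
edges = [(0, 1)]
print(forced_zero_entries(3, 2, edges))
```

By symmetry of the tensor, it suffices to enumerate sorted index tuples, which is why the sketch uses combinations with replacement.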
Lemma 5.4.

Fix any integer $k\geq 1$.

  • (i)

    The map $\phi^{k}:\mathcal{M}_{\infty}(\mathcal{G}_{B})\to\mathcal{M}^{(k)}(\mathcal{G}_{B})$ that sends random vectors with all moments finite to their $k$-th cumulant tensors is well-defined in the sense that $\phi^{k}(\mathcal{M}_{\infty}(\mathcal{G}_{B}))\subseteq\mathcal{M}^{(k)}(\mathcal{G}_{B})$.

  • (ii)

    Define the map $\phi^{\leq k}=(\phi^{l})_{l\leq k}:\mathcal{M}_{\infty}(\mathcal{G}_{B})\to\mathcal{M}^{\leq k}(\mathcal{G}_{B})$. Then $\phi^{\leq k}(\mathcal{M}_{\infty}(\mathcal{G}_{B}))$ is a full-dimensional subset of $\mathcal{M}^{\leq k}(\mathcal{G}_{B})$.

Theorem 5.5.

Let

$$\kappa(\mathcal{G}_{B}):=\{A=(a_{ij})\in\mathbb{R}^{2\times p}\>:\>a_{1i}\cdot a_{2j}=0\text{ if }u_{i}\leftrightarrow u_{j}\in\mathcal{G}_{B}\text{ or }i=j\}.$$

For every $\varepsilon\in\mathcal{M}_{\infty}(\mathcal{G}_{B})$, define $\kappa(\varepsilon)=\{A\in\mathbb{R}^{2\times p}\>:\>(A\varepsilon)_{1}\perp\!\!\!\perp(A\varepsilon)_{2}\}$, and let $\mathcal{S}(\mathcal{G}_{B})=\{\varepsilon\in\mathcal{M}_{\infty}(\mathcal{G}_{B})\>:\>\kappa(\varepsilon)\setminus\kappa(\mathcal{G}_{B})\neq\emptyset\}$, which is precisely the set of distributions for which 1 fails. Then there is a positive integer $k\leq 2(p+1)$ such that $\phi^{\leq k}(\mathcal{S}(\mathcal{G}_{B}))$ is a Lebesgue measure $0$ subset of $\mathcal{M}^{\leq k}(\mathcal{G}_{B})$.

We remark that, for simplicity, we stated Theorem 5.5 for distributions with finite moments of any order. However, we only needed the first $2(p+1)$ moments to be finite.
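The set $\kappa(\mathcal{G}_{B})$ in Theorem 5.5 is defined by simple bilinear zero constraints, so membership can be checked directly. The sketch below is a minimal illustration under the assumption that vertices are labeled $0,\dots,p-1$, that $A$ is given as a list of two rows, and that the bidirected edges are given as vertex pairs. The function name `in_kappa_graph` and the tolerance argument are illustrative and not part of the paper.

```python
def in_kappa_graph(A, bidirected_edges, tol=0.0):
    """Check whether a 2 x p matrix A (a list of two rows) lies in kappa(G_B)."""
    row1, row2 = A
    p = len(row1)
    # Pairs (i, j) with i == j or i <-> j in the bidirected graph.
    linked = {(i, i) for i in range(p)}
    for i, j in bidirected_edges:
        linked.add((i, j))
        linked.add((j, i))
    # Membership requires a_{1i} * a_{2j} = 0 on every linked pair.
    return all(abs(row1[i] * row2[j]) <= tol for i, j in linked)


# Hypothetical example with p = 3 and a single bidirected edge 0 <-> 1.
edges = [(0, 1)]
print(in_kappa_graph([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]], edges))
print(in_kappa_graph([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], edges))
```

In the first matrix the two rows load on different connected components of the bidirected graph, so the constraints hold. In the second matrix the rows load on the siblings 0 and 1, which violates the constraint for the pair $(0,1)$.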

Example 5.2.

One simple type of exceptional distribution for which 1 fails to hold is obtained by linear transformations of independent non-Gaussian variables. For example, let $U_{1},U_{2}$ be two independent standard normal random variables, let $V_{i}=\sqrt[3]{U_{i}}$ for $i=1,2$, and let $X=B\cdot(V_{1},V_{2})$ for any invertible $2\times 2$ matrix $B$. Then $X\in\mathcal{M}_{\infty}(\mathcal{G}_{B})$ for $\mathcal{G}_{B}=\{\{1,2\},\{1\leftrightarrow 2\}\}$, but by construction $V=B^{-1}\cdot X$, and the fact that $V_{1}\perp\!\!\!\perp V_{2}$ implies $X\in\mathcal{S}(\mathcal{G}_{B})$. As noted in Schkoda and Drton (2023), linear transformations of independent components form a null set already when considering cumulants of order $k\leq 3$.
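The construction in this example can be simulated as a sanity check. The sketch below is only an illustration: the mixing matrix, the sample size, the seed, and all function names are arbitrary assumptions, and the fourth-order cross-cumulant is estimated with the formula for zero-mean variables, which applies here because the cube root of a standard normal variable is symmetric around zero.

```python
import math
import random


def cbrt(x):
    """Real cube root that also handles negative inputs."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def simulate_x(B, n, seed=0):
    """Draw n samples of X = B (V1, V2), where V_i is the cube root of a standard normal."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        v1, v2 = cbrt(rng.gauss(0.0, 1.0)), cbrt(rng.gauss(0.0, 1.0))
        samples.append((B[0][0] * v1 + B[0][1] * v2, B[1][0] * v1 + B[1][1] * v2))
    return samples


def inverse_2x2(B):
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [[B[1][1] / det, -B[0][1] / det], [-B[1][0] / det, B[0][0] / det]]


def cross_cumulant_22(ys):
    """Empirical cum(Y1, Y1, Y2, Y2) for zero-mean data; it vanishes under independence."""
    n = len(ys)
    m22 = sum(a * a * b * b for a, b in ys) / n
    m20 = sum(a * a for a, _ in ys) / n
    m02 = sum(b * b for _, b in ys) / n
    m11 = sum(a * b for a, b in ys) / n
    return m22 - m20 * m02 - 2.0 * m11 ** 2


# Hypothetical invertible mixing matrix (an arbitrary choice for illustration).
B = [[1.0, 0.5], [0.3, 1.0]]
X = simulate_x(B, n=20000)
A = inverse_2x2(B)
V = [(A[0][0] * x1 + A[0][1] * x2, A[1][0] * x1 + A[1][1] * x2) for x1, x2 in X]

# Both rows of A have nonzero entries, so A violates a_{1i} * a_{2j} = 0 for the graph 1 <-> 2.
print(any(A[0][i] * A[1][j] != 0 for i in range(2) for j in range(2)))
# The cross-cumulant is clearly nonzero for X but close to zero for V = A X.
print(cross_cumulant_22(X), cross_cumulant_22(V))
```

The matrix $A=B^{-1}$ thus produces two independent linear combinations of $X$ although it does not satisfy the zero constraints of the graph, which is the failure mode described in the example.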

Remark 5.2.

Theorem 5.5 is of independent interest given the recent scholarly attention to generalizations of ICA that can deal with dependent error terms; see, e.g., Mesters and Zwiernik (2022); Garrote-López and Stephenson (2024); Wang and Seigal (2024). Indeed, if we consider $\mathcal{G}_{B}$ to be the empty graph, then Theorem 5.5 reduces to a generic version of the classical Darmois-Skitovich theorem that underlies ICA theory (Comon and Jutten, 2010, Thm. 9.5). From this perspective, Theorem 5.5 provides a generic generalization of the Darmois-Skitovich theorem to the case where the independence structure of the sources is more complex. A consequence of Examples 5.1-5.2 is that a generalization of the Darmois-Skitovich theorem that holds globally, i.e., for every non-Gaussian distribution, cannot be achieved.

6 Structure Identifiability

Up to this point, we have studied a parameter identifiability problem, assuming that the mixed graph $\mathcal{G}$ is given. However, there are numerous practical problems in which one needs to infer the graph from data. This model selection problem is often referred to as structure learning or causal discovery (Drton and Maathuis, 2017). Therefore, it is important to understand whether, within the model class under consideration, the graph is identifiable. If this is not the case, one is forced to either restrict the class of graphs under consideration, as done by Wang and Drton (2023), or to learn an equivalence class of graphs (Peters et al., 2017, §6). The problem that we consider in this section is as follows.

Problem (Model equivalence).

Given two ADMGs $\mathcal{G}$ and $\tilde{\mathcal{G}}$, is it true that $\mathcal{M}(\mathcal{G})=\mathcal{M}(\tilde{\mathcal{G}})\iff\mathcal{G}=\tilde{\mathcal{G}}$?

When $\mathcal{M}(\mathcal{G})=\mathcal{M}(\tilde{\mathcal{G}})$, we say that $\mathcal{G}$ and $\tilde{\mathcal{G}}$ are model equivalent. The equivalence class of an ADMG $\mathcal{G}$ is the set of all ADMGs $\tilde{\mathcal{G}}$ that are model equivalent to it.

In the next result, we prove that the model equivalence of two arbitrary ADMGs can be certified by solving a system of quadratic equations. For graphs of small size, such a system can be solved with computer algebra software, in the same spirit as in García-Puente et al. (2010). In Example 6.1, we use this result to characterize the equivalence class of the IV graph depicted in Fig. 1.

Theorem 6.1.

Let $\mathcal{G}$ and $\tilde{\mathcal{G}}$ be two arbitrary ADMGs with the same vertex set $V$. Then $\mathcal{M}(\mathcal{G})\subseteq\mathcal{M}(\tilde{\mathcal{G}})$ if and only if for every $\Lambda\in\mathbb{R}^{\mathcal{G}}$ and every $u\leftrightarrow v\notin\tilde{\mathcal{G}}$, the following system of equations has a solution in $\tilde{\Lambda}\in\mathbb{R}^{\tilde{\mathcal{G}}}$:

$$((I-\tilde{\Lambda})^{T}B_{\Lambda})_{uw_{1}}\cdot((I-\tilde{\Lambda})^{T}B_{\Lambda})_{vw_{2}}=0\quad\text{for all }w_{1},w_{2}\text{ with }w_{1}=w_{2}\text{ or }w_{1}\leftrightarrow w_{2}\in\mathcal{G}.\tag{6.1}$$
Proof.

By definition, $\mathcal{M}(\mathcal{G})\subseteq\mathcal{M}(\tilde{\mathcal{G}})$ if and only if for every $X=(I-\Lambda)^{-T}\varepsilon\in\mathcal{M}(\mathcal{G})$, we have $X\stackrel{d}{=}(I-\tilde{\Lambda})^{-T}\tilde{\varepsilon}$ for some $\tilde{\Lambda}\in\mathbb{R}^{\tilde{\mathcal{G}}}$ and $\tilde{\varepsilon}\in\mathcal{M}(\tilde{\mathcal{G}}_{B})$. This implies

$$\tilde{\varepsilon}\stackrel{d}{=}(I-\tilde{\Lambda})^{T}(I-\Lambda)^{-T}\varepsilon=\underbrace{(I-\tilde{\Lambda})^{T}B_{\Lambda}}_{A}\varepsilon.\tag{6.2}$$

In particular, Eq. 6.2 has to hold for the generic elements $\varepsilon\in\mathcal{M}(\mathcal{G}_{B})$ satisfying 1. By applying 1 to every pair $u\leftrightarrow v\notin\tilde{\mathcal{G}}$, we conclude that the condition is necessary for the model inclusion. In order to prove its sufficiency, we must demonstrate that $\tilde{\varepsilon}$ satisfies the connected set Markov property with respect to $\tilde{\mathcal{G}}$.

For every $v\in V$, let $D(v)=\{u\in V\>:\>a_{vu}\neq 0\}$, and $D(C):=\cup_{v\in C}D(v)$ for $C\subseteq V$. We need to show that $\tilde{\varepsilon}_{C}\perp\!\!\!\perp\tilde{\varepsilon}_{V\setminus\operatorname{Sib}(C)}$ for every $C\subseteq V$ that is connected in $\tilde{\mathcal{G}}_{B}$. If $D(V\setminus\operatorname{Sib}(C))\subseteq V\setminus\operatorname{Sib}(D(C))$ in $\mathcal{G}$, then the result follows from the connected set Markov property of $\varepsilon$ with respect to $\mathcal{G}$.
Assume, for contradiction, that $D(V\setminus\operatorname{Sib}(C))\cap\operatorname{Sib}(D(C))\neq\emptyset$, indicating that there exist $w_{1}\in D(V\setminus\operatorname{Sib}(C))$, $w_{2}\in D(C)$, $u\in V\setminus\operatorname{Sib}(C)$, and $v\in C$ such that $w_{1}\leftrightarrow w_{2}$ in $\mathcal{G}$ or $w_{1}=w_{2}$, and $a_{uw_{1}}\cdot a_{vw_{2}}\neq 0$, which contradicts Eq. 6.1 since $u\leftrightarrow v\notin\tilde{\mathcal{G}}$. ∎
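For small graphs, the left-hand sides of Eq. 6.1 can be evaluated numerically for a given $\Lambda$ and a candidate $\tilde{\Lambda}$. The following minimal sketch does this under stated assumptions: vertices are labeled $0,\dots,p-1$, the entry `Lam[i][j]` holds the coefficient of the edge $i\to j$, the graph is acyclic so that $(I-\Lambda)^{-1}$ is a finite Neumann sum, and all function names are illustrative. It only evaluates the products for one candidate; deciding whether a solution exists for every $\Lambda$ still requires symbolic or algebraic tools.

```python
def mat_mul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def identity(p):
    return [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]


def b_matrix(Lam):
    """B_Lambda = (I - Lambda)^{-T}. For an acyclic graph Lambda is nilpotent,
    so (I - Lambda)^{-1} = I + Lambda + ... + Lambda^(p-1)."""
    p = len(Lam)
    total, power = identity(p), identity(p)
    for _ in range(p - 1):
        power = mat_mul(power, Lam)
        total = [[total[i][j] + power[i][j] for j in range(p)] for i in range(p)]
    return transpose(total)


def eq61_products(Lam, Lam_tilde, bidirected_G, bidirected_G_tilde):
    """Left-hand sides of Eq. 6.1, keyed by (u, v, w1, w2)."""
    p = len(Lam)
    eye = identity(p)
    i_minus = [[eye[i][j] - Lam_tilde[i][j] for j in range(p)] for i in range(p)]
    A = mat_mul(transpose(i_minus), b_matrix(Lam))
    # Pairs (w1, w2) with w1 == w2 or w1 <-> w2 in G.
    linked = {(w, w) for w in range(p)}
    for i, j in bidirected_G:
        linked |= {(i, j), (j, i)}
    siblings_tilde = {frozenset(e) for e in bidirected_G_tilde}
    return {
        (u, v, w1, w2): A[u][w1] * A[v][w2]
        for u in range(p) for v in range(u + 1, p)
        if frozenset((u, v)) not in siblings_tilde
        for (w1, w2) in sorted(linked)
    }


# Hypothetical two-node example: G has the single edge 0 -> 1 with coefficient 0.7.
Lam = [[0.0, 0.7], [0.0, 0.0]]
same_graph = eq61_products(Lam, Lam, [], [])                          # G~ = G, Lambda~ = Lambda
reversed_edge = eq61_products(Lam, [[0.0, 0.0], [0.0, 0.0]], [], [])  # G~: 1 -> 0, coefficient 0
print(same_graph)
print(reversed_edge)
```

Since the set of linked pairs is symmetric, iterating over unordered pairs $u<v$ covers both orderings of $u$ and $v$. In the two-node example, choosing $\tilde{\Lambda}=\Lambda$ gives $A=I$ and all products vanish, whereas for the reversed edge the product for $w_{1}=w_{2}=0$ stays nonzero at the candidate shown.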

The next result applies Theorem 3.3 to graphically characterize when two ADMGs that only differ in one directed edge are model equivalent.

Theorem 6.2.

Let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$ be an ADMG, $e=u_{0}\to v_{0}\in\mathcal{G}$, and $\mathcal{G}^{e}=(V,E_{\rightarrow}\setminus\{e\},E_{\leftrightarrow})$. Then $\mathcal{M}(\mathcal{G}^{e})=\mathcal{M}(\mathcal{G})$ if and only if $\lambda_{u_{0}v_{0}}$ is not identifiable in $\mathcal{G}$.

Proof.

Since $\mathcal{G}^{e}$ is a subgraph of $\mathcal{G}$, we always have $\mathcal{M}(\mathcal{G}^{e})\subseteq\mathcal{M}(\mathcal{G})$. Hence, we only need to show that $\mathcal{M}(\mathcal{G}^{e})\supseteq\mathcal{M}(\mathcal{G})$ if and only if $\lambda_{u_{0}v_{0}}$ is not identifiable in $\mathcal{G}$.

Let $X=(I-\Lambda)^{-T}\varepsilon\in\mathcal{M}(\mathcal{G})$. Then $X\in\mathcal{M}(\mathcal{G}^{e})$ if and only if $X\stackrel{d}{=}(I-\tilde{\Lambda})^{-T}\tilde{\varepsilon}$ for $\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}^{e}}$ and $\tilde{\varepsilon}\in\mathcal{M}(\mathcal{G}_{B})$. This implies

$$\tilde{\varepsilon}\stackrel{d}{=}(I-\tilde{\Lambda})^{T}(I-\Lambda)^{-T}\varepsilon=\underbrace{(I-\tilde{\Lambda})^{T}B_{\Lambda}}_{A}\varepsilon.\tag{6.3}$$

Following the steps of the proof of Lemma 3.1, we get

$$a_{uv}=b_{uv}-\sum_{w\in\operatorname{pa}(u)^{e}\cap\operatorname{de}(v)}\tilde{\lambda}_{wu}b_{wv},$$

where $\operatorname{pa}(u)^{e}$ is the parent set of $u$ in $\mathcal{G}^{e}$.

Since $\mathcal{G}$ and $\mathcal{G}^{e}$ have the same bidirected part, we can repeat the same proof as for Lemma 3.2, concluding that a matrix $A$ as in Eq. 6.3 can exist if and only if, for every $v\in V$, the following system has a solution:

$$[(B_{\Lambda})_{\operatorname{pa}(v)^{e},R_{v}}]^{T}\cdot\tilde{\lambda}_{\operatorname{pa}(v)^{e},v}=[(B_{\Lambda})_{v,R_{v}}]^{T}.\tag{6.4}$$

For $v\neq v_{0}$, we have $\operatorname{pa}(v)^{e}=\operatorname{pa}(v)$. Hence, the system in Eq. 6.4 always has a solution, given by $\tilde{\lambda}_{\operatorname{pa}(v)^{e},v}=\lambda_{\operatorname{pa}(v),v}$.

For $v=v_{0}$, we have $\operatorname{pa}(v_{0})^{e}=\operatorname{pa}(v_{0})\setminus\{u_{0}\}$. The system in Eq. 6.4 has a solution if and only if $\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\},R_{v_{0}}})=\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\}\cup\{v_{0}\},R_{v_{0}}})$. Let $\Pi=(\pi_{0},\dots,\pi_{k})$ be a system of non-intersecting paths from $I\subseteq R_{v_{0}}$ to $J\subseteq\operatorname{pa}(v_{0})$. If $u_{0}\in\Pi$, let $\pi_{0}$ be the path that ends at $u_{0}$, and let $\pi_{0}^{*}$ be the path obtained by concatenating $\pi_{0}$ with the edge $u_{0}\to v_{0}$. By construction, $\Pi^{*}=(\pi_{0}^{*},\dots,\pi_{k})$ is a system of non-intersecting paths from $I$ to $J\setminus\{u_{0}\}\cup\{v_{0}\}\subseteq\operatorname{pa}(v_{0})\setminus\{u_{0}\}\cup\{v_{0}\}$. This proves that $r^{v_{0}}_{\operatorname{pa}(v_{0})\setminus\{u_{0}\}\cup\{v_{0}\}}=r^{v_{0}}_{\operatorname{pa}(v_{0})}$, which implies $\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\}\cup\{v_{0}\},R_{v_{0}}})=\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0}),R_{v_{0}}})$.
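The solvability criterion used for $v=v_{0}$ is the standard rank test for linear systems: the system in Eq. 6.4 has a solution exactly when the row of $B_{\Lambda}$ indexed by $v_{0}$ lies in the span of the rows indexed by the remaining parents. The sketch below illustrates this test with exact rational arithmetic. The function names and the numerical entries are hypothetical, and the rows are assumed to be already restricted to the columns in $R_{v_{0}}$.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    M = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(M[0]) if M else 0
    rk = 0
    for c in range(n_cols):
        # Find a pivot in column c at or below the current pivot row.
        pivot = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk


def has_solution(parent_rows, target_row):
    """The system parent_rows^T x = target_row^T is solvable if and only if
    appending target_row to parent_rows does not increase the rank."""
    return rank(parent_rows) == rank(parent_rows + [target_row])


# Hypothetical entries: one remaining parent row and two candidate target rows.
print(has_solution([[1, 2]], [2, 4]))
print(has_solution([[1, 2]], [0, 1]))
```

In the first call the target row is a multiple of the parent row, so the rank is unchanged and a solution exists. In the second call the rank increases from one to two and the system is inconsistent.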

From Theorem 3.3, we know that rank((BΛ)pa(v0){u0},Rv0)rank((BΛ)pa(v0),Rv0)1ranksubscriptsubscript𝐵Λpasubscript𝑣0subscript𝑢0subscript𝑅subscript𝑣0ranksubscriptsubscript𝐵Λpasubscript𝑣0subscript𝑅subscript𝑣01\operatorname{rank}((B_{\Lambda})_{\mathop{\rm pa}\nolimits(v_{0})\setminus\{u% _{0}\},R_{v_{0}}})\geq\operatorname{rank}((B_{\Lambda})_{\mathop{\rm pa}% \nolimits(v_{0}),R_{v_{0}}})-1roman_rank ( ( italic_B start_POSTSUBSCRIPT roman_Λ end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT roman_pa ( italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ∖ { italic_u start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT } , italic_R start_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ≥ roman_rank ( ( italic_B start_POSTSUBSCRIPT roman_Λ end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT roman_pa ( italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) , italic_R start_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) - 1 and that the two equality holds if only if λu0v0subscript𝜆subscript𝑢0subscript𝑣0\lambda_{u_{0}v_{0}}italic_λ start_POSTSUBSCRIPT italic_u start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT is identifiable. Finally, we can write

\begin{align*}
\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0}),R_{v_{0}}})-1 &\leq \operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\},R_{v_{0}}})\\
&\leq \operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\}\cup\{v_{0}\},R_{v_{0}}})=\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0}),R_{v_{0}}}).
\end{align*}

This concludes the proof by noticing that $\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\},R_{v_{0}}})=\operatorname{rank}((B_{\Lambda})_{\operatorname{pa}(v_{0})\setminus\{u_{0}\}\cup\{v_{0}\},R_{v_{0}}})$ if and only if the first inequality is strict. ∎

It is well known that the presence of a valid instrumental variable is sufficient for estimating the causal effect from a treatment to an outcome (Wright, 1928, App. B). However, testing from data that an instrument is valid is a much more involved task. Indeed, Gunsilius (2021) shows that in the nonparametric case for continuous treatment, the IV model does not impose any constraint on the observed distribution. Developing tests for the validity of an instrument under different parametric assumptions is an active and important area of research; see, e.g., Pearl (1995); Silva and Shimizu (2017); Xie et al. (2022). The next example shows that, unlike the nonparametric case, the IV model does impose constraints on the observed distribution in linear models. However, our results prove that these constraints are not sufficient for testing the validity of an instrument.

Example 6.1 (Instrumental Validity).
[Figure 7 shows five graphs, each on the nodes $I$, $T$, and $Y$.]
Figure 7: The IV graph (top row in the middle) with its equivalence class.

Let $\tilde{\mathcal{G}}_{IV}$ be the graph on the top left in Fig. 7. Applying Theorem 6.2 to this graph and the edges $I\to Y$ and $T\to Y$, one can see that $\tilde{\mathcal{G}}_{IV}$, $\tilde{\mathcal{G}}_{IV}^{I\to Y}$, and $\tilde{\mathcal{G}}_{IV}^{T\to Y}$ are all model equivalent. Applying the same argument to the graph on the bottom left in Fig. 7 and the edges $Y\to T$ and $I\to T$, we obtain that all the graphs depicted in Fig. 7 are model equivalent. Furthermore, by verifying the conditions of Theorem 6.1 for all ADMGs with three nodes using the software Macaulay2 (Grayson and Stillman, 2023), we confirm that the graphs depicted in Fig. 7 are the only ones in the equivalence class of the IV graph.

In particular, the equivalence of the IV graph $\mathcal{G}_{IV}$ and $\tilde{\mathcal{G}}_{IV}$ implies that the so-called exclusion restriction (Lousdal, 2018) is not testable within our model class.

Moreover, through direct computation (for instance, by applying Robeva and Seby (2021, Cor. 21) to any of the graphs in Fig. 7), it can be shown that all the graphs in the equivalence class impose the following moment constraints on the observed distribution

\[
\Sigma_{IT}\mathcal{T}_{III}-\Sigma_{II}\mathcal{T}_{IIT}=\Sigma_{IY}\mathcal{T}_{III}-\Sigma_{II}\mathcal{T}_{IIY}=0,
\]

where $\Sigma_{X_{1}X_{2}}=\mathbb{E}(X_{1}X_{2})$, and $\mathcal{T}_{X_{1}X_{2}X_{3}}=\mathbb{E}(X_{1}X_{2}X_{3})$.
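The two moment constraints can be checked numerically on simulated data. The following is a minimal sketch, not part of the original analysis: it assumes the IV graph $I\to T\to Y$ with $T\leftrightarrow Y$, mean-zero variables, and hypothetical choices for the coefficients (`a`, `b`), the error distributions, and the non-linear confounder `L`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Instrument: centred exponential, so it is mean zero and skewed (E[I^3] != 0).
inst = rng.exponential(1.0, n) - 1.0

# Hypothetical latent confounder entering the errors of T and Y non-linearly.
# Both error terms have mean zero and are independent of the instrument.
L = rng.standard_normal(n)
eps_T = np.tanh(L) + rng.uniform(-1.0, 1.0, n)
eps_Y = L**2 - 1.0 + rng.uniform(-1.0, 1.0, n)

a, b = 0.8, -0.5  # assumed coefficients for I -> T and T -> Y
T = a * inst + eps_T
Y = b * T + eps_Y

def m2(x, y):
    """Sample version of Sigma_{xy} = E(xy)."""
    return np.mean(x * y)

def m3(x, y, z):
    """Sample version of T_{xyz} = E(xyz)."""
    return np.mean(x * y * z)

# Sample analogues of the two constraints; both should be close to zero.
c_T = m2(inst, T) * m3(inst, inst, inst) - m2(inst, inst) * m3(inst, inst, T)
c_Y = m2(inst, Y) * m3(inst, inst, inst) - m2(inst, inst) * m3(inst, inst, Y)
print(c_T, c_Y)
```

Both quantities vanish only up to sampling error, so in practice they would be compared against a tolerance that shrinks with the sample size.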

7 Cyclic Graphs

Up to this point, we have exclusively studied acyclic models. This assumption has allowed us to obtain a complete characterization of the identifiable parameters. In this section, we relax this assumption and show that the proposed graphical criterion in Section 3 remains a necessary condition but is no longer sufficient. Moreover, we will provide a complete characterization of parameter identifiability for a special sub-class of cyclic graphs.

The first issue that one encounters when dealing with cyclic models is that the matrix $(I-\Lambda)$ might not be invertible. This implies that the assignment $X=\Lambda^{T}\cdot X+\varepsilon$ does not induce a unique solution for $X$. Hence, we need to restrict our attention to a subset of $\mathbb{R}^{\mathcal{G}_{D}}$, namely, the set $\mathbb{R}^{\mathcal{G}_{D}}_{\mathrm{reg}}:=\{\Lambda\in\mathbb{R}^{\mathcal{G}_{D}}\,:\,\det(I-\Lambda)\neq 0\}$. In this section, we focus on the identifiability of the matrix $\Lambda$ and reformulate the problem as follows.

Definition 7.1.

Define the parametrization map

\begin{align*}
\Phi_{\mathcal{G}_{\mathrm{reg}}}:\mathbb{R}^{\mathcal{G}_{D}}_{\mathrm{reg}}\times\mathcal{M}(\mathcal{G}_{B}) &\to \mathcal{M}(\mathcal{G})\\
(\Lambda,\varepsilon) &\mapsto (I-\Lambda)^{-T}\varepsilon,
\end{align*}

and for every $X\in\mathcal{M}(\mathcal{G})$, let the fiber of $X$ with respect to $\Phi_{\mathcal{G}}$ be

\[
\Phi_{\mathcal{G}_{\mathrm{reg}}}^{-1}(X):=\{(\Lambda,\varepsilon)\in\mathbb{R}^{\mathcal{G}_{D}}_{\mathrm{reg}}\times\mathcal{M}(\mathcal{G}_{B})\,:\,\Phi_{\mathcal{G}_{\mathrm{reg}}}(\Lambda,\varepsilon)\stackrel{d}{=}X\}. \tag{7.1}
\]

For any generic choice of $(\Lambda,\varepsilon)$, let $X=\Phi_{\mathcal{G}}(\Lambda,\varepsilon)$. We say that the graph $\mathcal{G}$ is generically identifiable if $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}_{\mathrm{reg}}}(\Phi_{\mathcal{G}_{\mathrm{reg}}}^{-1}(X))=\{\Lambda\}$.
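As an illustration of the restriction to $\mathbb{R}^{\mathcal{G}_{D}}_{\mathrm{reg}}$, the sketch below evaluates the parametrization map on a single realisation of the noise vector. The function name `parametrization_map`, the numerical values, and the convention that `Lam[i, j]` holds the coefficient of the edge $i\to j$ are assumptions made for this example only.

```python
import numpy as np

def parametrization_map(Lam, eps):
    """Return X = (I - Lam)^{-T} eps, raising if det(I - Lam) is (numerically) zero."""
    M = np.eye(Lam.shape[0]) - Lam
    if np.isclose(np.linalg.det(M), 0.0):
        raise ValueError("det(I - Lambda) = 0: X is not uniquely determined")
    # Solve (I - Lam)^T X = eps rather than forming the inverse explicitly.
    return np.linalg.solve(M.T, eps)

# A 2-cycle v1 -> v2 -> v1 with hypothetical coefficients.
Lam = np.array([[0.0, 0.5],
                [0.4, 0.0]])
eps = np.array([1.0, -2.0])  # one realisation of the noise vector

X = parametrization_map(Lam, eps)
# X solves the structural equations X = Lam^T X + eps.
residual = X - (Lam.T @ X + eps)

# A parameter matrix outside the regular set: det(I - Lam) = 1 - 2 * 0.5 = 0.
Lam_singular = np.array([[0.0, 2.0],
                         [0.5, 0.0]])
```

Calling `parametrization_map(Lam_singular, eps)` raises a `ValueError`, which mirrors the reason for excluding such matrices from the domain of the map.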

Lemma 7.1.

Let $X=\Phi_{\mathcal{G}_{\mathrm{reg}}}(\Lambda,\varepsilon)$ for a generic choice of parameters $(\Lambda,\varepsilon)\in\mathbb{R}^{\mathcal{G}_{D}}_{\mathrm{reg}}\times\mathcal{M}(\mathcal{G}_{B})$. The matrix $\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_{D}}_{\mathrm{reg}}$ belongs to $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}_{\mathrm{reg}}}(\Phi_{\mathcal{G}_{\mathrm{reg}}}^{-1}(X))$ if it is a solution to the following linear system of equations:

\[
\underbrace{[(B_{\Lambda})_{\operatorname{pa}(v),R_{v}}]^{T}}_{(B_{\Lambda})^{v}}\cdot\tilde{\lambda}_{\operatorname{pa}(v),v}=[(B_{\Lambda})_{v,R_{v}}]^{T},\quad\forall v\in V. \tag{7.2}
\]
Proof.

It suffices to notice that for the reverse implication of the proof of Lemma 3.2, we never used the acyclicity of the graph. Hence, the same proof applies. ∎

Theorem 7.2.

If a mixed graph $\mathcal{G}$ is identifiable, then for every $v\in V$, there exists a subset $I_{v}$ of $R_{v}$ of size $|\operatorname{pa}(v)|$ such that there is a system of non-intersecting paths from $I_{v}$ to $\operatorname{pa}(v)$.

Proof.

We argue by contraposition. If there is a $v\in V$ for which no such subset $I_{v}$ exists, then the matrix $(B_{\Lambda})^{v}$ is rank deficient; hence there is $\tilde{\lambda}_{\operatorname{pa}(v),v}\neq\lambda_{\operatorname{pa}(v),v}$ that solves Eq. 7.2. The matrix $\tilde{\Lambda}$ obtained from $\Lambda$ by substituting the column corresponding to $v$ with $\tilde{\lambda}_{\operatorname{pa}(v),v}$ belongs to $\mathrm{P}_{\mathbb{R}^{\mathcal{G}}_{\mathrm{reg}}}(\Phi_{\mathcal{G}}^{-1}(X))$ according to Lemma 7.1. Hence, the matrix $\Lambda$ is not identifiable. ∎
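The linear-algebra step in this argument, that a rank-deficient coefficient matrix admits a second solution, can be illustrated with a small numerical sketch. The matrix `M` below is a hypothetical stand-in for $(B_{\Lambda})^{v}$ and is not derived from any particular graph.

```python
import numpy as np

# Hypothetical rank-deficient coefficient matrix (rank 1, two unknowns).
M = np.array([[1.0, 2.0],
              [2.0, 4.0]])
lam = np.array([0.3, -0.7])  # plays the role of the true coefficient vector
rhs = M @ lam                # right-hand side generated by the true vector

# The right-singular vector for the zero singular value spans the null space of M.
_, singular_values, vt = np.linalg.svd(M)
null_vec = vt[-1]

# Adding a null vector gives a different vector solving the same system.
lam_tilde = lam + null_vec
```

Here `M @ lam_tilde` equals `rhs` up to rounding although `lam_tilde` differs from `lam`, which is the non-uniqueness the proof relies on.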

Example 7.1 (A non-identifiable cyclic graph).

For the graph in Fig. 8, we have $\operatorname{pa}(v_{2})=\{v_{1},v_{3}\}$, and $I_{v_{2}}=\{v_{1}\}$. Considering $v=v_{2}$ in Theorem 7.2, we can see that the matrix $\Lambda$ is not identifiable for this cyclic graph.

[Figure 8 shows a graph on the nodes $v_{1}$, $v_{2}$, $v_{3}$.]
Figure 8: A non-identifiable cyclic graph.
Example 7.2 (Non-sufficiency of the graphical criterion).

Let $\mathcal{G}_{2}$ be the 2-cycle in Fig. 9. The matrix $A$ of Eq. 3.1 will have the following form:

\[
A=\begin{bmatrix}
b_{v_{1}v_{1}}-\tilde{\lambda}_{v_{2}v_{1}}b_{v_{2}v_{1}} & b_{v_{1}v_{2}}-\tilde{\lambda}_{v_{2}v_{1}}b_{v_{2}v_{2}}\\
b_{v_{2}v_{1}}-\tilde{\lambda}_{v_{1}v_{2}}b_{v_{1}v_{1}} & b_{v_{2}v_{2}}-\tilde{\lambda}_{v_{1}v_{2}}b_{v_{1}v_{2}}
\end{bmatrix}.
\]

From 1 and the fact that there are no bidirected edges in the graph, we should have

\[
a_{v_{1}v_{1}}\cdot a_{v_{2}v_{1}}=a_{v_{1}v_{2}}\cdot a_{v_{2}v_{2}}=0.
\]

Since the graph has cycles, we cannot rule out the possibility that the diagonal entries of $A$ are equal to zero. Hence, a valid solution is

\[
\tilde{\lambda}_{v_{1}v_{2}}=\frac{b_{v_{2}v_{2}}}{b_{v_{1}v_{2}}}=\frac{1}{\lambda_{v_{2}v_{1}}},\quad\tilde{\lambda}_{v_{2}v_{1}}=\frac{b_{v_{1}v_{1}}}{b_{v_{2}v_{1}}}=\frac{1}{\lambda_{v_{1}v_{2}}}.
\]

This implies that the observed vector $X=(X_{1},X_{2})$ can be written in at least two different ways,

\[
\begin{bmatrix}1&-\lambda_{v_{2}v_{1}}\\ -\lambda_{v_{1}v_{2}}&1\end{bmatrix}^{-1}\begin{bmatrix}\varepsilon_{1}\\ \varepsilon_{2}\end{bmatrix},\quad\begin{bmatrix}1&-1/\lambda_{v_{1}v_{2}}\\ -1/\lambda_{v_{2}v_{1}}&1\end{bmatrix}^{-1}\begin{bmatrix}(-1/\lambda_{v_{1}v_{2}})\varepsilon_{2}\\ (-1/\lambda_{v_{2}v_{1}})\varepsilon_{1}\end{bmatrix},
\]

that are both compatible with the graph $\mathcal{G}_{2}$. In other words, the matrix $\Lambda$ is not identifiable.
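The equality of the two representations can be verified numerically. The sketch below reads the two displayed $2\times 2$ matrices as $I-\Lambda^{T}$ and $I-\tilde{\Lambda}^{T}$, so that $X$ is obtained by solving the corresponding linear systems. The coefficient values and the single noise realisation are hypothetical.

```python
import numpy as np

lam12, lam21 = 0.5, 0.4      # assumed nonzero coefficients with lam12 * lam21 != 1
eps = np.array([1.0, -2.0])  # one realisation of (eps_1, eps_2)

# Original parametrization: (I - Lambda^T) X = eps.
M = np.array([[1.0, -lam21],
              [-lam12, 1.0]])
X = np.linalg.solve(M, eps)

# Alternative parametrization with reciprocal coefficients and
# swapped, rescaled noise terms.
M_alt = np.array([[1.0, -1.0 / lam12],
                  [-1.0 / lam21, 1.0]])
eps_alt = np.array([-eps[1] / lam12, -eps[0] / lam21])
X_alt = np.linalg.solve(M_alt, eps_alt)

same_X = np.allclose(X, X_alt)
print(X, X_alt, same_X)
```

Each alternative noise term is a rescaling of a single original noise term, so the alternative noise vector again has independent components, consistent with the absence of bidirected edges in $\mathcal{G}_{2}$.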

[Figure 9 shows two graphs: $\mathcal{G}_{2}$ on the nodes $v_{1}, v_{2}$, and $\mathcal{G}_{k}$ on the nodes $v_{1}, v_{2}, \dots, v_{k-1}, v_{k}$.]
Figure 9: On the left, a 2-cycle. On the right, a $k$-cycle.
Lemma 7.3.

The $k$-cycle, depicted in Fig. 9, is generically identifiable if and only if $k\geq 3$.

Proof.

From Example 7.2, we already know that $\mathcal{G}_{2}$ is not identifiable. Hence, it only remains to show that $\mathcal{G}_{k}$ is identifiable for $k\geq 3$. We present the proof for the case $k=3$; a similar argument applies for larger $k$.

The matrix $A$ of Eq. 3.1 will have the following shape:

\[
A=\begin{bmatrix}
b_{v_{1}v_{1}}-\tilde{\lambda}_{v_{3}v_{1}}b_{v_{3}v_{1}} & b_{v_{1}v_{2}}-\tilde{\lambda}_{v_{3}v_{1}}b_{v_{3}v_{2}} & b_{v_{1}v_{3}}-\tilde{\lambda}_{v_{3}v_{1}}b_{v_{3}v_{3}}\\
b_{v_{2}v_{1}}-\tilde{\lambda}_{v_{1}v_{2}}b_{v_{1}v_{1}} & b_{v_{2}v_{2}}-\tilde{\lambda}_{v_{1}v_{2}}b_{v_{1}v_{2}} & b_{v_{2}v_{3}}-\tilde{\lambda}_{v_{1}v_{2}}b_{v_{1}v_{3}}\\
b_{v_{3}v_{1}}-\tilde{\lambda}_{v_{2}v_{3}}b_{v_{2}v_{1}} & b_{v_{3}v_{2}}-\tilde{\lambda}_{v_{2}v_{3}}b_{v_{2}v_{2}} & b_{v_{3}v_{3}}-\tilde{\lambda}_{v_{2}v_{3}}b_{v_{2}v_{3}}
\end{bmatrix}.
\]
end_CELL end_ROW end_ARG ] . (7.3)

From 1 and the fact that there are no bidirected edges in the graph, we should have

\[
\begin{cases}
a_{v_1v_1}\cdot a_{v_2v_1}=a_{v_1v_1}\cdot a_{v_3v_1}=0\\
a_{v_1v_2}\cdot a_{v_2v_2}=a_{v_3v_2}\cdot a_{v_2v_2}=0\\
a_{v_1v_3}\cdot a_{v_3v_3}=a_{v_2v_3}\cdot a_{v_3v_3}=0\\
a_{v_1v_3}\cdot a_{v_2v_3}=a_{v_1v_2}\cdot a_{v_3v_2}=a_{v_2v_1}\cdot a_{v_3v_1}=0.
\end{cases} \tag{7.4}
\]

If all the non-diagonal entries of $A$ are set to zero, we find $\tilde{\Lambda}=\Lambda$. We now show that this is the only solution of the system. Assume $a_{v_2v_1}\neq 0$. By the first line of Eq. 7.4, this implies $a_{v_1v_1}=0$. This leads to

\[
\tilde{\lambda}_{v_3v_1}=\frac{b_{v_1v_1}}{b_{v_3v_1}},
\]

plugging this value in Eq. 7.3, we obtain

\[
A=\begin{bmatrix}
0 & b_{v_1v_2}-(b_{v_1v_1}/b_{v_3v_1})b_{v_3v_2} & b_{v_1v_3}-(b_{v_1v_1}/b_{v_3v_1})b_{v_3v_3}\\
b_{v_2v_1}-\tilde{\lambda}_{v_1v_2}b_{v_1v_1} & b_{v_2v_2}-\tilde{\lambda}_{v_1v_2}b_{v_1v_2} & b_{v_2v_3}-\tilde{\lambda}_{v_1v_2}b_{v_1v_3}\\
b_{v_3v_1}-\tilde{\lambda}_{v_2v_3}b_{v_2v_1} & b_{v_3v_2}-\tilde{\lambda}_{v_2v_3}b_{v_2v_2} & b_{v_3v_3}-\tilde{\lambda}_{v_2v_3}b_{v_2v_3}
\end{bmatrix}. \tag{7.5}
\]

Writing explicitly the first row of the matrix in Eq. 7.5, one can see that

\[
a_{v_1v_2}=-\frac{\det((B_{\Lambda})_{\{1,3\},\{1,2\}})}{b_{v_3v_1}},\quad a_{v_1v_3}=-\frac{\det((B_{\Lambda})_{\{1,3\},\{1,3\}})}{b_{v_3v_1}},
\]

and both these quantities are different from zero for a generic choice of parameters in $\mathbb{R}^{\mathcal{G}_3}_{\mathrm{reg}}$, see Lemma A.2. Having $a_{v_1v_2}\neq 0$, Eq. 7.4 implies that $a_{v_2v_2}=0$ and following the same argument as above, we conclude that $a_{v_2v_3}\neq 0$. Finally, we have $a_{v_1v_3}\cdot a_{v_2v_3}\neq 0$ since both terms are non-zero, which contradicts Eq. 7.4. This proves that for a generic choice of parameters, the only solution of Eq. 7.4 is given by $\tilde{\Lambda}=\Lambda$. In other words, the matrix $\Lambda$ is generically identifiable. ∎

Remark 7.1.

It is noteworthy that Drton et al. (2011, Lemma 9) prove that $k$-cycles with $k\geq 3$ are not generically identifiable from the covariance matrix alone.

Theorem 7.4.

Let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow}=\emptyset)$ be a directed graph, such that $V=C_1\,\dot{\cup}\cdots\dot{\cup}\,C_n$, with $C_i$ being a $k_i$-cycle, and $\mathrm{pa}(C_i)\subseteq\bigcup_{j=0}^{i}C_j$. Then $\mathcal{G}$ is generically identifiable if and only if for every cycle $C=\{v_1,v_2\}$ of size $2$, we have $\mathrm{pa}(C)\setminus C=\mathrm{pa}(v_i)\setminus C$ for $i\in\{1,2\}$.
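The parent condition on 2-cycles in Theorem 7.4 can be checked mechanically. The following Python sketch is an illustration only and not part of the paper's implementation: the function name `two_cycle_condition_holds` and the encoding of the graph as a dictionary of parent sets are assumptions made here, and the code does not verify the hypothesis that $V$ decomposes into cycles ordered as in the theorem.

```python
def two_cycle_condition_holds(parents):
    """Check the parent condition of Theorem 7.4 on every 2-cycle.

    `parents` maps each node to the set of its parents (hypothetical encoding).
    For a 2-cycle C = {u, v}, requiring pa(C) minus C to equal pa(u) minus C
    and pa(v) minus C is the same as requiring that u and v have identical
    parents outside C.
    """
    for u in parents:
        for v in parents[u]:
            # {u, v} is a 2-cycle when both v -> u and u -> v are present.
            if u != v and u in parents.get(v, set()):
                cycle = {u, v}
                outside_u = set(parents[u]) - cycle
                outside_v = set(parents[v]) - cycle
                if outside_u != outside_v:
                    return False
    return True


# Two 2-cycles {a, b} and {c, d}; only the edge a -> c enters the second cycle.
unbalanced = {"a": {"b"}, "b": {"a"}, "c": {"d", "a"}, "d": {"c"}}
# Same graph with the additional edge a -> d.
balanced = {"a": {"b"}, "b": {"a"}, "c": {"d", "a"}, "d": {"c", "a"}}
# The 3-cycle v1 -> v2 -> v3 -> v1 contains no 2-cycle at all.
three_cycle = {"v1": {"v3"}, "v2": {"v1"}, "v3": {"v2"}}
```

In this toy encoding, the condition fails for `unbalanced` and holds for `balanced` and `three_cycle`.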

Proof.

8 Computational Experiments

8.1 Certifying Identifiability

We implemented the criterion from Theorem 4.3 using the algorithm of Dinits (1970) to solve the maximum flow problem. It operates with a complexity of $\mathcal{O}(|V|^4)$. Consequently, the algorithm we implemented has a complexity of $\mathcal{O}(|V|^5)$. We then determine the proportion of identifiable randomly sampled ADMGs of size $p=25,50$ and with $e=p,2p,\dots,10p$ edges. For each setup, we randomly sampled $5000$ graphs. More details on how the graphs were generated are given in Section C.1.
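For readers unfamiliar with the max-flow subroutine, here is a minimal Python sketch of a blocking-flow algorithm in the style of Dinits (1970). It is an illustration, not the authors' implementation: the class name `Dinic` and its methods are hypothetical, and the construction of the flow network from Theorem 4.3 is not reproduced. The worst-case running time of this scheme is $\mathcal{O}(|V|^2|E|)$, which is $\mathcal{O}(|V|^4)$ for networks with $|E|\leq|V|^2$.

```python
from collections import deque


class Dinic:
    """Maximum flow via level graphs and blocking flows (illustrative sketch)."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # per node: indices of incident edges
        self.to = []                         # edge index -> head node
        self.cap = []                        # edge index -> residual capacity

    def add_edge(self, u, v, capacity):
        # Forward edge gets an even index, its reverse edge the next odd index,
        # so that the partner of edge e is always e ^ 1.
        self.graph[u].append(len(self.to)); self.to.append(v); self.cap.append(capacity)
        self.graph[v].append(len(self.to)); self.to.append(u); self.cap.append(0)

    def _bfs(self, s, t):
        # Build the level graph; return True if t is reachable in the residual graph.
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for e in self.graph[u]:
                v = self.to[e]
                if self.cap[e] > 0 and self.level[v] < 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] >= 0

    def _dfs(self, u, t, pushed):
        # Push flow along one path of the level graph, skipping exhausted edges.
        if u == t:
            return pushed
        while self.it[u] < len(self.graph[u]):
            e = self.graph[u][self.it[u]]
            v = self.to[e]
            if self.cap[e] > 0 and self.level[v] == self.level[u] + 1:
                flow = self._dfs(v, t, min(pushed, self.cap[e]))
                if flow > 0:
                    self.cap[e] -= flow
                    self.cap[e ^ 1] += flow
                    return flow
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        total = 0
        while self._bfs(s, t):
            self.it = [0] * self.n  # current-edge pointers for this phase
            while True:
                flow = self._dfs(s, t, float("inf"))
                if flow == 0:
                    break
                total += flow
        return total


# Small hypothetical network with source 0 and sink 3.
network = Dinic(4)
for u, v, c in [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]:
    network.add_edge(u, v, c)
flow_value = network.max_flow(0, 3)
```

In the toy network, the two edges leaving the source have total capacity 5, and this bound is attained.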

The proportions are displayed in Fig. 10. We observe that for the given sampling scheme, most graphs yield identifiable models. The proportion of identifiable models remains similar across the two considered dimensions.

Figure 10: Proportion of ADMGs for which every entry of the matrix $\Lambda$ is generically identifiable.

8.2 Causal Effect Estimation

Herein, we present an optimization problem that can be used to infer the identifiable causal effects.

Lemma 8.1.

Let $X=\Phi_{\mathcal{G}}(\Lambda,\varepsilon)\in\mathcal{M}(\mathcal{G})$ for a generic choice of $(\Lambda,\varepsilon)$. Then $\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_D}$ is a solution of Eq. 3.3 if and only if it is a solution of the optimization problem

\[
\min_{\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_D}}\left\{\sum_{u\leftrightarrow v\notin\mathcal{G}}\mu\left([(I-\tilde{\Lambda})\cdot X]_u,[(I-\tilde{\Lambda})\cdot X]_v\right)\right\}, \tag{8.1}
\]

where $\mu(\cdot,\cdot)$ is any consistent measure of dependence, i.e., any nonnegative function that takes as input two random variables and returns zero if and only if the random variables are independent.

Proof.

For practical estimation, we may form an empirical version of the problem in Eq. 8.1 by replacing the dependence measure $\mu$ with suitable consistent estimates. One natural choice for $\mu$ is mutual information (Cover and Thomas, 2006, §8.6). However, the most popular estimator for the mutual information is based on $k$-nearest neighbor distances in the sample, which would result in a non-smooth optimization problem (Kraskov et al., 2004). Several alternatives to mutual information have been proposed in the literature (Székely et al., 2007; Geenens and Lafaye de Micheaux, 2022; Shi et al., 2022). In particular, the Hilbert-Schmidt independence criterion (HSIC) (Gretton et al., 2007) has been extensively applied in causal inference (Mooij et al., 2009; Saengkyongam et al., 2022). For our empirical study, we used the HSIC, but other measures of dependence can also be used.

When the underlying kernel is characteristic, the HSIC provides a measure of dependence that vanishes if and only if the variables for which it is computed are independent (Fukumizu et al., 2007, §2.2 and Thm. 1). Moreover, a consistent estimator for the HSIC (Gretton et al., 2007) is given by

\[
\widehat{\operatorname{HSIC}}_n(X,Y):=\operatorname{tr}(K_XHK_YH)/n^2,
\]

where $H_{i,j}=\delta_{i,j}-1/n$, and $K_X$ and $K_Y$ denote the respective Gram matrices.
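The estimator above can be computed directly from the two Gram matrices. The sketch below is a plain-Python illustration for scalar samples. The Gaussian (RBF) kernel, the bandwidth value, the simulated data, and all function names are assumptions made for this example and are not taken from the paper's code.

```python
import math
import random


def rbf_gram(x, bandwidth=1.0):
    """Gram matrix with entries exp(-(x_i - x_j)^2 / (2 * bandwidth^2))."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in x] for a in x]


def center(K):
    """Return H K H for H = I - (1/n) * ones, i.e. double-center the matrix K."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
            for i in range(n)]


def hsic(x, y, bandwidth=1.0):
    """Estimate tr(K_X H K_Y H) / n^2 for two scalar samples of equal length."""
    n = len(x)
    kx_centered = center(rbf_gram(x, bandwidth))  # H K_X H
    ky = rbf_gram(y, bandwidth)
    # tr(K_X H K_Y H) = tr((H K_X H) K_Y) = sum_ij (H K_X H)_ij * (K_Y)_ji
    trace = sum(kx_centered[i][j] * ky[j][i] for i in range(n) for j in range(n))
    return trace / n ** 2


# Hypothetical data: one strongly dependent pair and one independent pair.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(100)]
noise = [rng.uniform(-1, 1) for _ in range(100)]
hsic_dependent = hsic(x, [a + 0.1 * e for a, e in zip(x, noise)])
hsic_independent = hsic(x, noise)
```

With these simulated samples, the estimate for the dependent pair is expected to be clearly larger than the one for the independent pair, which should be close to zero.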

For a fixed graph $\mathcal{G}$ and a given sample matrix $X\in\mathbb{R}^{n\times p}$, we estimate $\Lambda$ as a solution to the following optimization problem

\[
\min_{\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_D}}\left\{\sum_{u\leftrightarrow v\notin\mathcal{G}}\widehat{\operatorname{HSIC}}_n\left([(I-\tilde{\Lambda})\cdot X]_u,[(I-\tilde{\Lambda})\cdot X]_v\right)\right\}. \tag{8.2}
\]

We used the L-BFGS method of Liu and Nocedal (1989) to solve the above optimization problem. We considered two types of kernels in our experiments: radial basis function (RBF) kernels, whose results are presented in Fig. 13 in Section C.2, and polynomial kernels, whose results are depicted in Fig. 12. More details on the data generation, as well as additional experiments with different error distributions, can be found in Section C.1. It is noteworthy that in our experiments the polynomial kernels of degree 2 (Schölkopf and Smola, 2018, §2.3) provide a better estimate, and the results rely less on the initialization compared to the RBF kernels.
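To make the objective in Eq. 8.2 concrete, the following Python sketch evaluates it for a toy two-node graph $v_1\to v_2$ without a bidirected edge. It is not the procedure used in the experiments: L-BFGS is replaced by a coarse grid search, the data-generating choices are hypothetical, and the residual convention (subtracting $\tilde{\lambda}_{uv}X_u$ from $X_v$ for each edge $u\to v$) is an assumption. The kernel is taken to be $k(s,t)=(st+1)^2$, one common form of a degree-2 polynomial kernel. For this kernel, $\operatorname{tr}(K_XHK_YH)/n^2$ equals the squared Frobenius norm of the empirical cross-covariance of the features $(s^2,\sqrt{2}s)$, so no Gram matrix has to be formed.

```python
import random


def cov(a, b):
    """Biased sample covariance (normalized by n)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((s - mean_a) * (t - mean_b) for s, t in zip(a, b)) / n


def hsic_poly2(x, y):
    """tr(K_X H K_Y H) / n^2 for the kernel k(s, t) = (s * t + 1)^2.

    Uses the feature map (s^2, sqrt(2) * s, 1); centering removes the constant
    feature, leaving the squared cross-covariances below.
    """
    x2 = [s * s for s in x]
    y2 = [t * t for t in y]
    return (cov(x2, y2) ** 2 + 2 * cov(x2, y) ** 2
            + 2 * cov(x, y2) ** 2 + 4 * cov(x, y) ** 2)


def residuals(columns, lam):
    """Residual of node v: X_v minus lam[(u, v)] * X_u over all edges u -> v."""
    res = {v: list(col) for v, col in columns.items()}
    for (u, v), coef in lam.items():
        res[v] = [r - coef * xu for r, xu in zip(res[v], columns[u])]
    return res


def objective(columns, lam, missing_bidirected):
    """Sum of HSIC estimates over node pairs that share no bidirected edge."""
    res = residuals(columns, lam)
    return sum(hsic_poly2(res[u], res[v]) for u, v in missing_bidirected)


# Hypothetical data: X_1 = e_1, X_2 = 0.8 * X_1 + e_2 with independent uniform errors.
rng = random.Random(0)
n, true_coef = 500, 0.8
x1 = [rng.uniform(-1, 1) for _ in range(n)]
x2 = [true_coef * a + rng.uniform(-1, 1) for a in x1]
columns = {"v1": x1, "v2": x2}

# Grid search over candidate coefficients in [-2, 2] with step 0.05.
grid = [i / 20 for i in range(-40, 41)]
estimate = min(grid, key=lambda c: objective(columns, {("v1", "v2"): c}, [("v1", "v2")]))
```

In this unconfounded toy example, the minimizer is expected to land near the coefficient 0.8 used to simulate the data. A gradient-based optimizer such as L-BFGS would replace the grid search when several coefficients are estimated jointly.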

Figure 11: Double confounder graph (nodes $v_1$, $v_2$, $v_3$).

In Fig. 12, we report the performance of our method on the IV graph (Fig. 1), the ADMG shown in Fig. 11, and the 3-cycle (Fig. 9). We used the normalized Frobenius loss between the estimated matrix $\hat{\Lambda}$ and the true matrix $\Lambda$, i.e., $\|\hat{\Lambda}-\Lambda\|_F/\|\Lambda\|_F$, as our loss function and reported the mean loss over fifty randomly sampled $\Lambda$. We compared our method against the empirical likelihood (EL) estimator proposed in Wang and Drton (2017). Note that for the IV graph, the parameters are identifiable from the covariance matrix. Therefore, the EL estimator, which is a covariance-based method, outperforms our method, as the covariance matrix estimator is more sample-efficient than the HSIC estimator. In contrast, for the ADMG in Fig. 11 and the 3-cycle, the performance of the EL estimator does not improve with the sample size. This is because the parameters of these two mixed graphs cannot be determined solely from the covariance matrix; see Foygel et al. (2012, Prop. 2) and Drton et al. (2011, Lemma 9). When initialized at the regression coefficient, the performance of our proposed estimator improves with the sample size. This indicates that the potential numerical issues arising from the non-convexity of the objective function and the estimation errors become less relevant as the sample size grows.
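The loss used in this comparison is straightforward to compute. The following Python sketch is only an illustration with matrices stored as nested lists; the function name and the example matrices are hypothetical.

```python
import math


def normalized_frobenius_loss(estimate, truth):
    """Return ||estimate - truth||_F / ||truth||_F for matrices given as nested lists."""
    diff_sq = sum((e - t) ** 2
                  for row_e, row_t in zip(estimate, truth)
                  for e, t in zip(row_e, row_t))
    truth_sq = sum(t ** 2 for row_t in truth for t in row_t)
    return math.sqrt(diff_sq) / math.sqrt(truth_sq)


# Hypothetical 2 x 2 example: the truth has Frobenius norm 5.
truth = [[0.0, 3.0], [4.0, 0.0]]
partial = [[0.0, 3.0], [0.0, 0.0]]  # misses the entry 4, so the error norm is 4
loss_partial = normalized_frobenius_loss(partial, truth)
```

A loss of 0 corresponds to exact recovery, and a loss of 1 is attained by the all-zero estimate.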

Figure 12: The performance of our proposed estimator with a polynomial kernel of degree 2 for different initializations on the IV graph, the ADMG of Fig. 11, and the 3-cycle. “EL,” “REG,” and “TV” represent the empirical likelihood, the regression coefficient, and the true value, respectively. Note that the $y$-axis is on a log scale.

9 Conclusions

In this work, we studied the generic identifiability of direct causal effects in linear structural equation models with dependent errors. For acyclic models, we obtained a complete graphical characterization of the identifiable causal effects, with a graphical criterion that can be checked in polynomial time in the size of the graph. For cyclic models, we proved that the same graphical conditions are necessary for identifiability, and we provided counterexamples to show that they are not sufficient. For a smaller family of cyclic models, we provided a complete graphical characterization of the identifiable effects. A complete characterization of identifiability for cyclic models, however, involves additional mathematical subtleties and is left as a problem for future work.

We also discussed the identifiability of the causal graph. For this problem, we provided an algorithm to test the model equivalence of two arbitrary ADMGs and a graphical characterization of the model equivalence for two graphs that only differ in the presence of a directed edge.

Most of the literature on identifiability in linear structural equation models leverages specific moment equations to obtain identifiability results. In this work, we follow a different approach and exploit the information contained in the whole distribution, explicitly leveraging the independence relations dictated by the missing bidirected edges in the graph. To the best of our knowledge, our work is the first to follow this route in this generality. In an initial exploration of parameter estimation we showed that estimates obtained by minimizing structurally absent dependences can be useful.

To conclude, we highlight possible future directions.

Beyond Observational Data.

In this paper, we considered the situation when only observational data is available. Recent identification results that additionally consider information from interventional datasets have been proposed for non-parametric models (Lee et al., 2020; Kivva et al., 2022). Extending our results to these setups can be seen as a natural future direction.

Non-linear Models.

In the graphical models literature, different non-parametric assumptions on the functional relations among the variables have been used to guarantee the identifiability of the causal structure (Peters et al., 2017, §7.1). Similar assumptions have been used to prove the identifiability of the causal effect under specific causal assumptions, e.g., Imbens and Newey (2009). However, a general graphical criterion for identification in non-linear structural equation models is currently missing. We believe that the ideas we propose in this work admit suitable extensions to these more general settings.

Structure Identifiability.

In Section 6, we provided a graphical characterization for the model equivalence of two ADMGs that only differ in the presence of an edge. A graphical characterization for the model equivalence of two arbitrary ADMGs is still an open problem. Its solution would be relevant for the development of algorithms for causal discovery from observational data that work under minimal linearity assumptions.

References

  • Ardiyansyah and Sodomaco (2023) Muhammad Ardiyansyah and Luca Sodomaco. Dimensions of higher order factor analysis models. Algebr. Stat., 14(1):91–108, 2023.
  • Barber et al. (2022) Rina Foygel Barber, Mathias Drton, Nils Sturma, and Luca Weihs. Half-trek criterion for identifiability of latent variable models. Ann. Statist., 50(6):3174–3196, 2022.
  • Brito (2004) Carlos Brito. Graphical models for identification in structural equation models. Ph.D. thesis, UCLA Computer Science Dept., 2004.
  • Chen et al. (2022) Li Chen, Rasmus Kyng, Yang P. Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 - November 3, 2022, pages 612–623. IEEE, 2022.
  • Comon and Jutten (2010) Pierre Comon and Christian Jutten. Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press, Inc., USA, 1st edition, 2010.
  • Cormen et al. (2009) Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to algorithms. MIT Press, Cambridge, MA, third edition, 2009.
  • Cover and Thomas (2006) Thomas M. Cover and Joy A. Thomas. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, USA, 2006.
  • Cox et al. (2015) David A. Cox, John Little, and Donal O’Shea. Ideals, varieties, and algorithms. Undergraduate Texts in Mathematics. Springer, Cham, fourth edition, 2015. An introduction to computational algebraic geometry and commutative algebra.
  • di Dio and Schmüdgen (2022) Philipp J. di Dio and Konrad Schmüdgen. The multidimensional truncated moment problem: the moment cone. J. Math. Anal. Appl., 511(1):Paper No. 126066, 38, 2022.
  • Dinits (1970) E. A. Dinits. Algorithm for solution of a problem of maximum flow in a network with power estimation. Sov. Math., Dokl., 11:1277–1280, 1970. ISSN 0197-6788.
  • Drton (2018) Mathias Drton. Algebraic problems in structural equation modeling. In The 50th anniversary of Gröbner bases, volume 77 of Adv. Stud. Pure Math., pages 35–86. Math. Soc. Japan, Tokyo, 2018.
  • Drton and Maathuis (2017) Mathias Drton and Marloes H. Maathuis. Structure learning in graphical modeling. Annu. Rev. Stat. Appl., 4:365–393, 2017.
  • Drton and Richardson (2008) Mathias Drton and Thomas S. Richardson. Binary models for marginal independence. J. R. Stat. Soc. Ser. B Stat. Methodol., 70(2):287–309, 2008.
  • Drton et al. (2011) Mathias Drton, Rina Foygel, and Seth Sullivant. Global identifiability of linear structural equation models. Ann. Statist., 39(2):865–886, 2011.
  • Eriksson and Koivunen (2004) Jan Eriksson and Visa Koivunen. Identifiability, separability, and uniqueness of linear ICA models. IEEE Signal Process. Lett., 11(7):601–604, 2004.
  • Evans and Ringel (1999) William N. Evans and Jeanne S. Ringel. Can higher cigarette taxes improve birth outcomes? Journal of Public Economics, 72(1):135–154, 1999.
  • Foygel et al. (2012) Rina Foygel, Jan Draisma, and Mathias Drton. Half-trek criterion for generic identifiability of linear structural equation models. Ann. Statist., 40(3):1682–1713, 2012.
  • Fukumizu et al. (2007) Kenji Fukumizu, Arthur Gretton, Xiaohai Sun, and Bernhard Schölkopf. Kernel measures of conditional dependence. In Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 2007, pages 489–496. Curran Associates, Inc., 2007.
  • García-Puente et al. (2010) Luis D. García-Puente, Sarah Spielvogel, and Seth Sullivant. Identifying causal effects with computer algebra. In UAI 2010, Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, July 8-11, 2010, pages 193–200. AUAI Press, 2010.
  • Garrote-López and Stephenson (2024) Marina Garrote-López and Monroe Stephenson. Cumulant tensors in partitioned independent component analysis. arXiv:2402.10089, 2024.
  • Geenens and Lafaye de Micheaux (2022) Gery Geenens and Pierre Lafaye de Micheaux. The Hellinger correlation. J. Amer. Statist. Assoc., 117(538):639–653, 2022.
  • Grayson and Stillman (2023) Daniel R. Grayson and Michael E. Stillman. Macaulay2, a software system for research in algebraic geometry. Available at http://www2.macaulay2.com, 2023.
  • Gretton et al. (2007) Arthur Gretton, Kenji Fukumizu, Choon Hui Teo, Le Song, Bernhard Schölkopf, and Alexander J. Smola. A kernel statistical test of independence. In Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 2007, pages 585–592. Curran Associates, Inc., 2007.
  • Gunsilius (2021) F. F. Gunsilius. Nontestability of instrument validity under continuous treatments. Biometrika, 108(4):989–995, 2021.
  • Imbens and Newey (2009) Guido Imbens and Whitney Newey. Identification and estimation of triangular simultaneous equations models without additivity. Econometrica, 77(5):1481–1512, 2009.
  • Kivva et al. (2022) Yaroslav Kivva, Ehsan Mokhtarian, Jalal Etesami, and Negar Kiyavash. Revisiting the general identifiability problem. In Uncertainty in Artificial Intelligence, Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, UAI 2022, 1-5 August 2022, Eindhoven, The Netherlands, volume 180 of Proceedings of Machine Learning Research, pages 1022–1030. PMLR, 2022.
  • Kivva et al. (2023a) Yaroslav Kivva, Jalal Etesami, and Negar Kiyavash. On identifiability of conditional causal effects. In Uncertainty in Artificial Intelligence, UAI 2023, July 31 - 4 August 2023, Pittsburgh, PA, USA, volume 216 of Proceedings of Machine Learning Research, pages 1078–1086. PMLR, 2023a.
  • Kivva et al. (2023b) Yaroslav Kivva, Saber Salehkaleybar, and Negar Kiyavash. A cross-moment approach for causal effect estimation. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023, 2023b.
  • Kraskov et al. (2004) Alexander Kraskov, Harald Stögbauer, and Peter Grassberger. Estimating mutual information. Phys. Rev. E (3), 69(6):066138, 16, 2004.
  • Kumor et al. (2020) Daniel Kumor, Carlos Cinelli, and Elias Bareinboim. Efficient identification in linear structural causal models with auxiliary cutsets. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, pages 5501–5510. PMLR, 2020.
  • Lauritzen (1996) Steffen L. Lauritzen. Graphical Models. Oxford University Press, 1996.
  • Lee et al. (2020) Sanghack Lee, Juan D. Correa, and Elias Bareinboim. General identifiability with arbitrary surrogate experiments. In Ryan P. Adams and Vibhav Gogate, editors, Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, volume 115 of Proceedings of Machine Learning Research, pages 389–398. PMLR, 2020.
  • Lewicki and Sejnowski (2000) Michael S. Lewicki and Terrence J. Sejnowski. Learning overcomplete representations. Neural Comput., 12(2):337–365, 2000.
  • Liu and Nocedal (1989) Dong C. Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Math. Programming, 45(3):503–528, 1989.
  • Liu et al. (2021) Yiheng Liu, Elina Robeva, and Huanqing Wang. Learning linear non-Gaussian graphical models with multidirected edges. J. Causal Inference, 9(1):250–263, 2021.
  • Lousdal (2018) Mette Lise Lousdal. An introduction to instrumental variable assumptions, validation and estimation. Emerging themes in epidemiology, 15(1):1, 2018.
  • Maathuis et al. (2019) Marloes Maathuis, Mathias Drton, Steffen Lauritzen, and Martin Wainwright, editors. Handbook of Graphical Models. Chapman & Hall/CRC Handbooks of Modern Statistical Methods. CRC Press, Boca Raton, FL, 2019.
  • McCullagh (1987) Peter McCullagh. Tensor methods in statistics. Monographs on Statistics and Applied Probability. Chapman & Hall, London, 1987.
  • Mesters and Zwiernik (2022) Geert Mesters and Piotr Zwiernik. Non-independent components analysis. arXiv:2206.13668, 2022.
  • Michałek and Sturmfels (2021) Mateusz Michałek and Bernd Sturmfels. Invitation to nonlinear algebra, volume 211 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2021.
  • Mooij et al. (2009) Joris M. Mooij, Dominik Janzing, Jonas Peters, and Bernhard Schölkopf. Regression by dependence minimization and its application to causal inference in additive noise models. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14-18, 2009, volume 382 of ACM International Conference Proceeding Series, pages 745–752. ACM, 2009.
  • Okamoto (1973) Masashi Okamoto. Distinctness of the eigenvalues of a quadratic form in a multivariate sample. Ann. Statist., 1:763–765, 1973.
  • Pearl (1995) Judea Pearl. On the testability of causal models with latent and instrumental variables. In Philippe Besnard and Steve Hanks, editors, UAI ’95: Proceedings of the Eleventh Annual Conference on Uncertainty in Artificial Intelligence, Montreal, Quebec, Canada, August 18-20, 1995, pages 435–443. Morgan Kaufmann, 1995.
  • Pearl (2009) Judea Pearl. Causality. Cambridge University Press, Cambridge, second edition, 2009. Models, reasoning, and inference.
  • Pearl (2017) Judea Pearl. A linear ‘microscope’ for interventions and counterfactuals. J. Causal Inference, 5(1):Art. No. 20170003, 2017.
  • Peters et al. (2017) Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of Causal Inference. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA, 2017. Foundations and learning algorithms.
  • Richardson (2003) Thomas Richardson. Markov properties for acyclic directed mixed graphs. Scand. J. Statist., 30(1):145–157, 2003.
  • Richardson and Spirtes (2002) Thomas Richardson and Peter Spirtes. Ancestral graph Markov models. Ann. Statist., 30(4):962–1030, 2002.
  • Richardson et al. (2023) Thomas S. Richardson, Robin J. Evans, James M. Robins, and Ilya Shpitser. Nested Markov properties for acyclic directed mixed graphs. Ann. Statist., 51(1):334–361, 2023.
  • Robeva and Seby (2021) Elina Robeva and Jean-Baptiste Seby. Multi-trek separation in linear structural equation models. SIAM Journal on Applied Algebra and Geometry, 5(2):278–303, 2021.
  • Saengkyongam et al. (2022) Sorawit Saengkyongam, Leonard Henckel, Niklas Pfister, and Jonas Peters. Exploiting independent instruments: Identification and distribution generalization. In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA, volume 162 of Proceedings of Machine Learning Research, pages 18935–18958. PMLR, 2022.
  • Salehkaleybar et al. (2020) Saber Salehkaleybar, AmirEmad Ghassami, Negar Kiyavash, and Kun Zhang. Learning linear non-Gaussian causal models in the presence of latent variables. J. Mach. Learn. Res., 21:Paper No. 39, 24, 2020.
  • Schkoda and Drton (2023) Daniela Schkoda and Mathias Drton. Goodness-of-fit tests for linear non-Gaussian structural equation models. arXiv:2311.04585, 2023.
  • Schölkopf and Smola (2018) Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. The MIT Press, 2018.
  • Shi et al. (2022) Hongjian Shi, Mathias Drton, and Fang Han. On the power of Chatterjee’s rank correlation. Biometrika, 109(2):317–333, 2022.
  • Shimizu (2022) Shōhei Shimizu. Statistical Causal Discovery: LiNGAM Approach. Springer, 2022.
  • Shpitser (2023) Ilya Shpitser. When does the ID algorithm fail? arXiv:2307.03750, 2023.
  • Shpitser and Pearl (2006) Ilya Shpitser and Judea Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 2, AAAI’06, pages 1219–1226. AAAI Press, 2006.
  • Shuai et al. (2023) Kang Shuai, Shanshan Luo, Yue Zhang, Feng Xie, and Yangbo He. Identification and estimation of causal effects using non-Gaussianity and auxiliary covariates. arXiv:2304.14895, 2023.
  • Silva and Shimizu (2017) Ricardo Silva and Shohei Shimizu. Learning instrumental variables with structural and non-Gaussianity assumptions. J. Mach. Learn. Res., 18:Paper No. 120, 49, 2017.
  • Spirtes et al. (2000) Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, prediction, and search. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA, second edition, 2000. With additional material by David Heckerman, Christopher Meek, Gregory F. Cooper and Thomas Richardson, A Bradford Book.
  • Sullivant et al. (2010) Seth Sullivant, Kelli Talaska, and Jan Draisma. Trek separation for Gaussian graphical models. Ann. Statist., 38(3):1665–1685, 2010.
  • Székely et al. (2007) Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov. Measuring and testing dependence by correlation of distances. Ann. Statist., 35(6):2769–2794, 2007.
  • Verma and Pearl (1990) Thomas Verma and Judea Pearl. Equivalence and synthesis of causal models. In Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, UAI ’90, pages 255–270, USA, 1990. Elsevier Science Inc.
  • Wang and Seigal (2024) Kexin Wang and Anna Seigal. Identifiability of overcomplete independent component analysis. arXiv:2401.14709, 2024.
  • Wang and Drton (2017) Y. Samuel Wang and Mathias Drton. Empirical likelihood for linear structural equation models with dependent errors. Stat, 6:434–447, 2017.
  • Wang and Drton (2023) Y. Samuel Wang and Mathias Drton. Causal discovery with unobserved confounding and non-Gaussian data. J. Mach. Learn. Res., 24:Paper No. 271, 61, 2023.
  • Wright (1928) P.G. Wright. The Tariff on Animal and Vegetable Oils. Investigations in international commercial policies. Macmillan, 1928.
  • Xie et al. (2022) Feng Xie, Yangbo He, Zhi Geng, Zhengming Chen, Ru Hou, and Kun Zhang. Testability of instrumental variables in linear non-Gaussian acyclic causal models. Entropy, 24(4):Paper No. 512, 19, 2022.
  • Xie et al. (2023) Feng Xie, Biwei Huang, Zhengming Chen, Ruichu Cai, Clark Glymour, Zhi Geng, and Kun Zhang. Generalized independent noise condition for estimating causal structure with latent variables. arXiv:2308.06718, 2023.

Appendix A Notions of Non-Linear Algebra

In this section, we give the basic definitions of non-linear algebra that we will need for the proofs; we refer the interested reader to Cox et al. (2015); Michałek and Sturmfels (2021) for more details.

Definition A.1.

For every natural number $n$, we denote the ring of polynomials in $n$ variables $x_1,\dots,x_n$ by $\mathbb{R}[x_1,\dots,x_n]$. Let $S$ be a, possibly infinite, subset of $\mathbb{R}[x_1,\dots,x_n]$. The affine variety associated to $S$ is defined as $\mathcal{V}(S)=\{x\in\mathbb{R}^n : f(x)=0,\ \forall f\in S\}$. The vanishing ideal associated to a variety $\mathcal{V}\subseteq\mathbb{R}^n$ is $\mathcal{I}(\mathcal{V})=\{f\in\mathbb{R}[x_1,\dots,x_n] : f(x)=0,\ \forall x\in\mathcal{V}\}$, and the coordinate ring of $\mathcal{V}$ is defined as $\mathbb{R}[\mathcal{V}]=\mathbb{R}[x_1,\dots,x_n]/\mathcal{I}(\mathcal{V})$.
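As a small illustration of Definition A.1 (our own toy example with $n=2$, not taken from the cited references), the first line below shows a case where the vanishing ideal is generated by $S$ itself, while the second shows that over $\mathbb{R}$ the vanishing ideal can be strictly larger than the ideal generated by $S$:

```latex
\begin{aligned}
S &= \{x_1 x_2\}: &
\mathcal{V}(S) &= \{x\in\mathbb{R}^2 : x_1 = 0 \text{ or } x_2 = 0\}, &
\mathcal{I}(\mathcal{V}(S)) &= \langle x_1 x_2\rangle, &
\mathbb{R}[\mathcal{V}(S)] &= \mathbb{R}[x_1,x_2]/\langle x_1 x_2\rangle,\\
S' &= \{x_1^2 + x_2^2\}: &
\mathcal{V}(S') &= \{(0,0)\}, &
\mathcal{I}(\mathcal{V}(S')) &= \langle x_1, x_2\rangle \supsetneq \langle x_1^2 + x_2^2\rangle. &&
\end{aligned}
```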

Lemma A.1.

Sullivant et al. (2010, Prop. 3.1) Let $\mathcal{G}_D$ be any directed graph. For every $\Lambda\in\mathbb{R}^{\mathcal{G}_D}_{\mathrm{reg}}$ we have

$$(B_{\Lambda})_{uv}=(I-\Lambda)^{-T}_{uv}=\sum_{P\in\mathcal{P}(v,u)}\lambda^{P},$$

in particular $(B_{\Lambda})_{uv}=0$ if $u\notin\operatorname{de}(v)$.
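The path expansion in Lemma A.1 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the four-node DAG, its integer coefficients, and the helper name `path_sum` are hypothetical, and we assume the indexing convention that the entry $\Lambda_{uw}$ holds the coefficient of the edge $u\to w$. Rather than inverting a matrix, it builds $B_\Lambda$ from path sums and verifies $(I-\Lambda)^{T}B_\Lambda=I$.

```python
# Hypothetical toy DAG on nodes 0..3; lam[(u, w)] is the coefficient of edge u -> w.
lam = {(0, 1): 2, (1, 2): 3, (0, 2): 5, (2, 3): 7}
n = 4

def path_sum(v, u):
    """Sum of path monomials over all directed paths from v to u.

    The trivial path contributes 1 when v == u. Recursion terminates
    because the graph is acyclic.
    """
    total = 1 if v == u else 0
    for (a, w), coeff in lam.items():
        if a == v:
            total += coeff * path_sum(w, u)
    return total

# (B_Lambda)_{uv} collects the directed paths from v to u.
B = [[path_sum(v, u) for v in range(n)] for u in range(n)]

# Verify B = (I - Lambda)^{-T} by checking (I - Lambda)^T B = I.
Lam = [[lam.get((u, w), 0) for w in range(n)] for u in range(n)]
ImLT = [[(1 if r == c else 0) - Lam[c][r] for c in range(n)] for r in range(n)]
prod = [[sum(ImLT[r][m] * B[m][c] for m in range(n)) for c in range(n)]
        for r in range(n)]
identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
```

In this toy graph the two paths from node 0 to node 2 contribute $5 + 2\cdot 3 = 11$, and the entry of $B$ indexed by $(u,v)=(0,3)$ vanishes because node 0 is not a descendant of node 3.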

Lemma A.2.

Sullivant et al. (2010, Lem. 3.3), Foygel et al. (2012, Supplement, Lem. 1) Let $\mathcal{G}_D$ be any directed graph, and let $I,J$ be two subsets of $V$ of the same size. Then for every $\Lambda\in\mathbb{R}^{\mathcal{G}_D}_{\mathrm{reg}}$ we have:

$$\det\big((B_{\Lambda})_{I,J}\big)=\det\big((I-\Lambda)^{-1}_{I,J}\big)=\sum_{\Pi\in\mathcal{P}(I,J)}|\sigma_{\Pi}|\,\lambda^{\Pi}=\sum_{\Pi\in\tilde{\mathcal{P}}(I,J)}|\sigma_{\Pi}|\,\lambda^{\Pi},$$

where $|\sigma_{\Pi}|$ denotes the sign of the permutation. In particular, $\tilde{\mathcal{P}}(I,J)=\emptyset$ implies $\det\big((B_{\Lambda})_{I,J}\big)=0$. The reverse implication holds for a generic choice of $\Lambda\in\mathbb{R}^{\mathcal{G}_D}_{\mathrm{reg}}$, i.e., for any $\Lambda$ outside a Lebesgue measure $0$ subset of $\mathbb{R}^{\mathcal{G}_D}_{\mathrm{reg}}$.
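The following sketch illustrates Lemma A.2 on two hypothetical five-node DAGs with $|I|=|J|=2$. It assumes the convention that the matrix entry indexed by $(i,j)$ is the sum of path monomials over directed paths from $i$ to $j$; the graphs, coefficients, and helper names are ours. In the first graph every path from $I=\{0,1\}$ to $J=\{3,4\}$ passes through node 2, so no non-intersecting path system exists and the determinant vanishes. Adding the edge $0\to 3$ creates exactly one such system, $(0\to 3,\ 1\to 2\to 4)$, and the determinant equals its monomial.

```python
def path_sum(lam, v, u):
    """Sum of path monomials over all directed paths from v to u in a DAG."""
    total = 1 if v == u else 0
    for (a, w), coeff in lam.items():
        if a == v:
            total += coeff * path_sum(lam, w, u)
    return total

def det2(lam, I, J):
    """Determinant of the 2x2 matrix with entries path_sum(i, j), i in I, j in J."""
    (i0, i1), (j0, j1) = I, J
    return (path_sum(lam, i0, j0) * path_sum(lam, i1, j1)
            - path_sum(lam, i0, j1) * path_sum(lam, i1, j0))

# All paths from {0, 1} to {3, 4} are forced through node 2.
bottleneck = {(0, 2): 2, (1, 2): 3, (2, 3): 5, (2, 4): 7}

# Same graph plus the edge 0 -> 3, which bypasses node 2.
bypass = dict(bottleneck)
bypass[(0, 3)] = 11

det_bottleneck = det2(bottleneck, (0, 1), (3, 4))
det_bypass = det2(bypass, (0, 1), (3, 4))

# Monomial of the unique non-intersecting system (0 -> 3, 1 -> 2 -> 4).
expected = bypass[(0, 3)] * bypass[(1, 2)] * bypass[(2, 4)]
```

The contributions of the intersecting systems cancel in the signed sum, which is why only the non-intersecting system survives in the second case.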

Appendix B Proofs

B.1 Proofs for Section 4

Proof of Lemma 4.1.

First, notice that for all practical purposes, we can consider the edge capacity to be $|\operatorname{pa}(v)|$ instead of $\infty$; this implies that we can exploit additional properties of maximum flow problems with integer values.

We are going to show that to every flow $f$ in $G^v_Q$ of size $k$ with integer values, we can associate a pair $(I_f,P_f)\in 2^{R_v}\times 2^{Q}$ and a system of paths $\Pi_f\in\tilde{\mathcal{P}}(I_f,P_f)$ such that $|I_f|=k$, and vice versa. That is, to every pair $(I,P)\in 2^{R_v}\times 2^{Q}$ and system of paths $\Pi=(\pi_1,\dots,\pi_k)\in\tilde{\mathcal{P}}(I,P)$ such that $|I|=k$, we can associate an integer flow $f_{I,P}$ of size $k$.

Let us first consider a flow $f$ with integer values in $G^v_Q$. Since the capacity of each node that is neither a source nor a sink is $1$, we can restrict the image of $f$ to be $\{0,1\}$. Define $I_f=\{u\in R_v : f(s_v,u)=1\}$. Since the size of the flow is $k$, we have $|I_f|=k$. For every $u\in I_f$ consider the path $\pi_u: u=:u_0,u_1,\dots,u_{k_u}$ such that $f(u_i,u_{i+1})=1$ and $u_{k_u}\in Q$. This is well defined since for every $u_i\in V_v\setminus\{s_v,t_v\}$ there is at most one other $u_{i+1}\in V_v\setminus\{s_v,t_v\}$ such that $f(u_i,u_{i+1})=1$, and if one assumes that there is a $u_{i-1}$ such that $f(u_{i-1},u_i)=1$, then the existence of $u_{i+1}$ is guaranteed by the first equality in Eq. 4.1. Let $P_f=\{u_{k_u} : u\in I_f\}$. By contradiction, we prove that $\Pi=(\pi_u : u\in I_f)\in\tilde{\mathcal{P}}(I_f,P_f)$. Suppose two paths in $\Pi$ intersect at a node $u$. This implies that there are $u_0\neq u_1\in V_v$ such that $f(u_0,u)=f(u_1,u)=1$, hence $\sum_{w\in V_v}f(w,u)\geq 2>c_V(u)$. We obtain a violation of Eq. 4.1.

For the other implication, consider $(I,P)\in 2^{R_v}\times 2^{Q}$ and a system of paths $\Pi=(\pi_1,\dots,\pi_k)\in\tilde{\mathcal{P}}(I,P)$ such that $|I|=k$. We define $f_{I,P}$ as follows:

$$f_{I,P}(u,w)=\begin{cases}1, & \text{if } u=s_v \text{ and } w\in I, \text{ or } u\in P \text{ and } w=t_v,\\ 1, & \text{if } \exists\, j\in[k] : u\to w\in\pi_j,\\ 0, & \text{otherwise}.\end{cases}\tag{B.1}$$

We need to show that $f_{I,P}$ satisfies Eq. 4.1. Since the capacity of each edge is infinite, we only need to check the first inequality; this holds because $\Pi$ is a non-intersecting system of paths, and so each node has at most one incoming edge and at most one outgoing edge for which the flow is different from $0$. By directly plugging Eq. B.1 into Eq. 4.2, it is straightforward to show that $|f_{I,P}|=k$.

To conclude the proof, we need to show that there is a solution to $G^v_Q$ with integer values. This is ensured by applying Cormen et al. (2009, Thm. 26.10) and the fact that all the capacities in $G^v_Q$ are integers. ∎
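The correspondence between integer flows and non-intersecting path systems used in this proof can be sketched in code. The block below is a minimal illustration under stated assumptions, not the construction of $G^v_Q$ from Section 4: it takes a generic directed graph with arbitrary source and sink sets, the function name `max_disjoint_paths` is hypothetical, and a plain breadth-first augmenting-path routine stands in for the almost-linear-time algorithm of Chen et al. (2022). Unit node capacities are encoded by splitting each node into an "in" copy and an "out" copy, and the infinite edge capacity is replaced by a finite integer, in the spirit of the first remark of the proof.

```python
from collections import deque, defaultdict

def max_disjoint_paths(edges, sources, sinks):
    """Maximum number of vertex-disjoint directed paths from `sources` to `sinks`."""
    cap = defaultdict(int)   # residual capacities
    adj = defaultdict(set)   # residual adjacency (forward and reverse arcs)

    def add_arc(a, b, c):
        cap[(a, b)] += c
        adj[a].add(b)
        adj[b].add(a)

    nodes = {x for e in edges for x in e} | set(sources) | set(sinks)
    for u in nodes:
        add_arc((u, "in"), (u, "out"), 1)      # unit node capacity
    big = len(nodes)                           # finite stand-in for infinity
    for u, w in edges:
        add_arc((u, "out"), (w, "in"), big)
    s, t = "s", "t"
    for u in sources:
        add_arc(s, (u, "in"), 1)
    for u in sinks:
        add_arc((u, "out"), t, 1)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b in adj[a]:
                if b not in parent and cap[(a, b)] > 0:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            return flow
        # Every augmenting path leaves s through an arc of capacity 1,
        # so exactly one unit is pushed per augmentation.
        b = t
        while parent[b] is not None:
            a = parent[b]
            cap[(a, b)] -= 1
            cap[(b, a)] += 1
            b = a
        flow += 1

# Hypothetical examples: all paths pass through node 2, then a bypass edge is added.
bottleneck_edges = [(0, 2), (1, 2), (2, 3), (2, 4)]
bypass_edges = bottleneck_edges + [(0, 3)]
k_bottleneck = max_disjoint_paths(bottleneck_edges, {0, 1}, {3, 4})
k_bypass = max_disjoint_paths(bypass_edges, {0, 1}, {3, 4})
```

Because all capacities are integers, each augmentation keeps the flow integral, which mirrors the integrality argument invoked at the end of the proof.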

Proof of Theorem 4.2.

Chen et al. (2022) proved that any maximum flow problem $(G,s,t,c_V,c_D)$ can be solved in time almost linear in the number of edges of the graph $G$. For every node $v\in V$ and $Q\subseteq\operatorname{pa}(v)$, to certify the identifiability of $\lambda_{Q,v}$ one needs to solve the maximum flow problems $G^v_Q$ and $G^v_{\operatorname{pa}(v)}$ and then check whether the difference of the sizes of the corresponding maximum flows is $|Q|$. Since both $G^v_Q$ and $G^v_{\operatorname{pa}(v)}$ have at most $2(|V|+1)$ nodes, and hence $\mathcal{O}(|V|^2)$ edges, the overall complexity is $\mathcal{O}(|V|^{2+o(1)})$. ∎

Proof of Theorem 4.3.

To certify the identifiability of all the directed edges, i.e., the whole matrix, one needs to solve the maximum flow problem $G^v_{\operatorname{pa}(v)}$ for every $v$ in $V$ and check whether the maximum flow has size $|\operatorname{pa}(v)|$. This adds a multiplicative factor $|V|$ to the result of Theorem 4.2, which leads to $\mathcal{O}(|V|^{3+o(1)})$. ∎

B.2 Proofs for Section 5.2

In the sequel, we will consider $\mathcal{M}^{\leq k}(\mathcal{G}_B)$ as a variety in the space given by the Cartesian product of the symmetric tensor spaces $(\operatorname{Sym}_l(\mathbb{R}^p))_{2\leq l\leq k}$, which is isomorphic to $\mathbb{R}^{\sum_{s=2}^{k}\binom{p+s-1}{s}}$. We denote the corresponding coordinate ring as $\mathbb{R}[\mathcal{M}^{\leq k}(\mathcal{G}_B)]$.
For every $k$-tuple $(i_1,\dots,i_k)$, we denote by $\mathcal{M}^{(k)}_{\setminus(i,\dots,i)}(\mathcal{G}_B)$ the projection of $\mathcal{M}^{(k)}(\mathcal{G}_B)$ on the coordinates not corresponding to the entry $(i_1,\dots,i_k)$.
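As a quick sanity check on the dimension of the ambient space stated above, the exponent $\sum_{s=2}^{k}\binom{p+s-1}{s}$ can be evaluated directly. The function name `ambient_dimension` is ours and only illustrates the formula.

```python
from math import comb


def ambient_dimension(p, k):
    """Dimension of the product of Sym_l(R^p) for 2 <= l <= k.

    Each Sym_s(R^p) has dimension C(p + s - 1, s), the number of
    multisets of size s drawn from p indices.
    """
    return sum(comb(p + s - 1, s) for s in range(2, k + 1))
```

For instance, with $p=2$ and $k=3$ the two summands are $\binom{3}{2}=3$ and $\binom{4}{3}=4$, so the ambient space is $\mathbb{R}^{7}$.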

Proof of Lemma 5.4.

The fact that $\phi^{k}$ is well defined is a consequence of Comon and Jutten (2010, Prop. 3.1).

There is a one-to-one linear transformation between cumulants and moments; see, e.g., McCullagh (1987, §2.3); hence, it is enough to prove the result for the corresponding set of moments. It is known that the set of symmetric tensors that can be generated as a moment of a distribution is a full dimensional convex cone in the space of symmetric tensors; see, e.g., di Dio and Schmüdgen (2022, Lem. 3.3). Hence, the same result holds for the set of cumulants. The set $\phi^{\leq k}(\mathcal{M}_{\infty}(\mathcal{G}_B))$ is the projection of this convex cone along the coordinate axes corresponding to connected subsets of $\mathcal{G}_B$, so it is itself a full dimensional convex cone in $\mathcal{M}^{\leq k}(\mathcal{G}_B)$. ∎

Lemma B.1.

Let $\varepsilon\in\mathcal{M}_{\infty}(\mathcal{G}_B)$ and $A\in\mathbb{R}^{2\times p}$. Then

$$\mathcal{C}^{(k)}(A\cdot\varepsilon)_{i_1,\dots,i_k}=\sum_{\substack{\{j_1,\dots,j_k\}\text{ is}\\ \text{connected in }\mathcal{G}_B}}\mathcal{C}^{(k)}(\varepsilon)_{j_1,\dots,j_k}\,a_{i_1 j_1}\cdots a_{i_k j_k}.$$
Proof.

A direct consequence of Lemma 5.3 and Lemma 5.4. ∎
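The formula of Lemma B.1 can be checked exactly in a small special case. The sketch below assumes mutually independent errors (so that, on our reading, the only index sets contributing to the sum are the singletons $\{j\}$), order $k=3$, and toy discrete distributions and a toy matrix `A` that are ours and purely illustrative. It relies on the standard fact that a joint third-order cumulant equals the mixed third central moment. Indices are 0-based in the code.

```python
from fractions import Fraction as F
from itertools import product


def third_cumulant(outcomes, idx):
    """Joint third cumulant of coordinates idx = (i1, i2, i3).

    outcomes is a list of (probability, vector) pairs; for order 3 the
    joint cumulant coincides with the mixed central moment.
    """
    means = [sum(pr * x[i] for pr, x in outcomes) for i in idx]
    return sum(
        pr * (x[idx[0]] - means[0]) * (x[idx[1]] - means[1]) * (x[idx[2]] - means[2])
        for pr, x in outcomes
    )


# Toy independent, skewed error distributions as (probability, value) pairs.
marginals = [
    [(F(3, 4), 0), (F(1, 4), 1)],
    [(F(1, 2), -1), (F(1, 4), 0), (F(1, 4), 2)],
    [(F(2, 3), 0), (F(1, 3), 3)],
]

# Joint law of eps = (eps_1, eps_2, eps_3) under independence.
joint = []
for combo in product(*marginals):
    prob = F(1)
    for pr, _ in combo:
        prob *= pr
    joint.append((prob, tuple(val for _, val in combo)))

# Toy 2 x 3 mixing matrix and the law of A * eps.
A = [[1, 2, 0], [0, 1, -1]]
mixed = [
    (pr, tuple(sum(A[r][j] * x[j] for j in range(3)) for r in range(2)))
    for pr, x in joint
]


def lhs(idx):
    """Cumulant of A * eps computed directly from its distribution."""
    return third_cumulant(mixed, idx)


def rhs(idx):
    """Right-hand side of the lemma, restricted to singleton index sets."""
    return sum(
        third_cumulant(joint, (j, j, j)) * A[idx[0]][j] * A[idx[1]][j] * A[idx[2]][j]
        for j in range(3)
    )
```

Because all arithmetic uses `Fraction`, `lhs(idx) == rhs(idx)` can be compared exactly for every index triple in $\{0,1\}^3$, with no sampling error.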

Proof of Theorem 5.5.

From Lemma 5.4, we know that $\dim(\phi^{\leq k}(\mathcal{M}_{\infty}(\mathcal{G}_B)))=\dim(\mathcal{M}^{\leq k}(\mathcal{G}_B))$. Hence, it is enough to show that $\phi^{\leq k}(\mathcal{S}(\mathcal{G}_B))$ lies in a subvariety of $\mathcal{M}^{\leq k}(\mathcal{G}_B)$ of strictly smaller dimension; see, e.g., Okamoto (1973, Lemma).

Notice that we can write

$$\mathcal{S}(\mathcal{G}_B)=\Bigg(\bigcup_{i\in[p]}\mathcal{S}_i(\mathcal{G}_B)\Bigg)\cup\Bigg(\bigcup_{u_i\leftrightarrow u_j\in\mathcal{G}_B}\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)\Bigg),$$

where

$$\kappa_i(\varepsilon)=\{A\in\kappa(\varepsilon)\,:\,a_{1i}\cdot a_{2i}\neq 0\},\qquad \mathcal{S}_i(\mathcal{G}_B)=\{\varepsilon\in\mathcal{S}(\mathcal{G}_B)\,:\,\kappa_i(\varepsilon)\neq\emptyset\},$$

while $\kappa_{i\leftrightarrow j}(\varepsilon)$ and $\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)$ are defined in a similar way. Hence, it is enough to prove that both $\phi^{\leq k}(\mathcal{S}_i(\mathcal{G}_B))$ and $\phi^{\leq k}(\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B))$ are Lebesgue measure $0$ subsets of $\mathcal{M}^{\leq k}(\mathcal{G}_B)$ for $k$ high enough.

We start by bounding the dimension of $\mathcal{S}_i(\mathcal{G}_B)$. For every $\varepsilon\in\mathcal{S}_i(\mathcal{G}_B)$, every $A\in\kappa_i(\varepsilon)$, and every $0\neq s,t\in\mathbb{N}$ we can use Lemma B.1 to write

$$0=\mathcal{C}^{(s+t)}(A\cdot\varepsilon)_{\underbrace{1,\dots,1}_{s},\underbrace{2,\dots,2}_{t}}=\mathcal{C}^{(s+t)}(\varepsilon)_{i,\dots,i}\,a_{1i}^{s}\cdot a_{2i}^{t}+r^{s+t}_{\setminus i}(\varepsilon,A),$$

where $r^{s+t}_{\setminus i}(\varepsilon,A)$ is a non-zero polynomial in $\mathbb{R}[\mathcal{M}^{(s+t)}_{\setminus(i,\dots,i)}(\mathcal{G}_B),\,a_{i,j}\,:\,i,j\in[p]]$. Notice that for the first equality we used that $(A\varepsilon)_1$ and $(A\varepsilon)_2$ are independent. This implies that we can write

$$\mathcal{C}^{(s+t)}(\varepsilon)_{i,\dots,i}=\phi^{(s+t)}(\varepsilon)_{(i,\dots,i)}=-\frac{r^{s+t}_{\setminus i}\left(\phi^{(s+t)}(\varepsilon)_{\setminus(i,\dots,i)},A\right)}{a_{1i}^{s}\cdot a_{2i}^{t}}. \tag{B.2}$$

We can define a rational map $\psi^{s,t}_{i}:\mathcal{M}^{(s+t)}_{\setminus(i,\dots,i)}(\mathcal{G}_B)\times\mathbb{R}^{2p}\to\mathcal{M}^{(s+t)}(\mathcal{G}_B)$ in the following way

$$\psi^{s,t}_{i}(A,\mathcal{C}^{s+t})_{i_1,\dots,i_k}:=\begin{cases}-\dfrac{r^{s+t}_{\setminus i}\left(\mathcal{C}^{s+t}_{\setminus(i,\dots,i)},A\right)}{a_{1i}^{s}\cdot a_{2i}^{t}}, & \text{if }(i_1,\dots,i_k)=(i,\dots,i),\\[2mm] \mathcal{C}^{s+t}_{i_1,\dots,i_k}, & \text{otherwise},\end{cases}$$

see, e.g., Cox et al. (2015, §5) for the definition of rational map.

What Eq. B.2 shows is that

$$\phi^{(s+t)}(\mathcal{S}_i(\mathcal{G}_B))\subseteq\psi^{s,t}_{i}\left(\mathcal{M}^{(s+t)}_{\setminus(i,\dots,i)}(\mathcal{G}_B)\times\mathbb{R}^{2p}\right)\subseteq\mathcal{M}^{(s+t)}(\mathcal{G}_B).$$

Let us consider $\psi^{\leq k}_{i}:\mathcal{M}^{\leq k}_{\setminus(i,\dots,i)}(\mathcal{G}_B)\times\mathbb{R}^{2p}\to\mathcal{M}^{\leq k}(\mathcal{G}_B)$ such that $\psi^{\leq k}_{i}\circ\pi_{k_0}=\psi^{k_0-1,1}_{i}$, where $\pi_{k_0}$ is the projection of $\mathcal{M}^{\leq k}$ onto $\mathcal{M}^{(k_0)}$ for every $k_0\leq k$.
Again, we have $\phi^{\leq k}(\mathcal{S}_i(\mathcal{G}_B))\subseteq\psi^{\leq k}_{i}(\mathcal{M}^{\leq k}_{\setminus(i,\dots,i)}(\mathcal{G}_B)\times\mathbb{R}^{2p})\subseteq\mathcal{M}^{\leq k}(\mathcal{G}_B)$, which settles the case of $\mathcal{S}_i(\mathcal{G}_B)$ once we notice that

$$\begin{aligned}\dim(\phi^{\leq k}(\mathcal{S}_i(\mathcal{G}_B)))&\leq\dim\Big(\psi^{\leq k}_{i}\big(\mathcal{M}^{\leq k}_{\setminus(i,\dots,i)}(\mathcal{G}_B)\times\mathbb{R}^{2p}\big)\Big)\\ &\leq\dim(\mathbb{R}^{2p})+\dim(\mathcal{M}^{\leq k}_{\setminus(i,\dots,i)}(\mathcal{G}_B))\leq 2p+\dim(\mathcal{M}^{\leq k}(\mathcal{G}_B))-(k-1),\end{aligned}$$

which is strictly smaller than $\dim(\mathcal{M}^{\leq k}(\mathcal{G}_B))$ if $k\geq 2(p+1)$.
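For completeness, the threshold on $k$ follows from a one-line rearrangement of the last bound in the display above:

$$2p+\dim(\mathcal{M}^{\leq k}(\mathcal{G}_B))-(k-1)<\dim(\mathcal{M}^{\leq k}(\mathcal{G}_B))\iff k-1>2p\iff k\geq 2(p+1),$$

where the last equivalence uses that $k$ is an integer.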

In order to prove the result for $\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)$, we first notice that we can always write

$$\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)=\Bigg(\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)\cap\bigcup_{i\in[p]}\mathcal{S}_i(\mathcal{G}_B)\Bigg)\,\dot{\cup}\,\Bigg(\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)\setminus\bigcup_{i\in[p]}\mathcal{S}_i(\mathcal{G}_B)\Bigg).$$

Since we have already bounded the dimension of $\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)\cap\bigcup_{i\in[p]}\mathcal{S}_i(\mathcal{G}_B)$, to conclude the proof we only need to bound the dimension of

$$\tilde{\mathcal{S}}_{i\leftrightarrow j}(\mathcal{G}_B):=\mathcal{S}_{i\leftrightarrow j}(\mathcal{G}_B)\setminus\bigcup_{i\in[p]}\mathcal{S}_i(\mathcal{G}_B).$$

For every $\varepsilon\in\tilde{\mathcal{S}}_{i\leftrightarrow j}(\mathcal{G}_B)$, every $A\in\kappa_{i\leftrightarrow j}(\varepsilon)$, and every $2\leq k\in\mathbb{N}$ we can use Lemma B.1 to write

$$0=\mathcal{C}^{(k)}(A\cdot\varepsilon)_{1,\dots,1,2}=\mathcal{C}^{(k)}(\varepsilon)_{i,\dots,i,j}\,a_{1i}^{k-1}\cdot a_{2j}+r^{k}_{\setminus i\leftrightarrow j}(\varepsilon,A),$$

where we used that $a_{1i}\cdot a_{2i}=0$ for every $i$, which is a consequence of $\varepsilon\notin\bigcup_{i\in[p]}\mathcal{S}_i(\mathcal{G}_B)$, to simplify the formula given in Lemma B.1. This allows us to write

$$\mathcal{C}^{(k)}(\varepsilon)_{i,\dots,i,j}=\phi^{k}(\varepsilon)_{(i,\dots,i,j)}=-\frac{r^{k}_{\setminus i\leftrightarrow j}\left(\phi^{k}(\varepsilon)_{\setminus(i,\dots,i,j)},A\right)}{a_{1i}^{k-1}\cdot a_{2j}}.$$

The rest of the proof follows verbatim the case of $\mathcal{S}_i(\mathcal{G}_B)$. ∎

B.3 Proofs for Section 7

Lemma B.2.

Let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow})$ be a mixed graph. Assume the vertex set can be partitioned as $V=C_1\,\dot{\cup}\cdots\dot{\cup}\,C_n$, with $C_i$ being a $k_i$-cycle, and $\mathrm{pa}(C_i)\subseteq\bigcup_{j=0}^{i}C_j$, where $\dot{\cup}$ denotes the union of disjoint sets. Then, $\mathcal{G}$ is generically identifiable if and only if $\lambda_{C_i,v}$ is identifiable for every $i\in[n]$ and $v\in C_i$, and the graphical criterion in Theorem 7.2 is satisfied.

Proof.

If the matrix $\Lambda$ is identifiable, then by definition, all of its columns are also identifiable, and from Theorem 7.2, we know that the graphical condition is satisfied. We now prove that the reverse implication is also true.

By plugging in $\lambda_{C_i,v}$ instead of $\tilde{\lambda}_{C_i,v}$ in Eq. 3.2, one can see that the matrix $A$ has the following shape

$$\begin{bmatrix}I_{k_1}&0&\cdots&0\\ A_{C_2,C_1}&I_{k_2}&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ A_{C_n,C_1}&A_{C_n,C_2}&\cdots&I_{k_n}\end{bmatrix}.$$

In particular, we have av,v=1subscript𝑎𝑣𝑣1a_{v,v}=1italic_a start_POSTSUBSCRIPT italic_v , italic_v end_POSTSUBSCRIPT = 1 for every vV𝑣𝑉v\in Vitalic_v ∈ italic_V. The same proof as in Lemma 3.2 applies. ∎

Theorem B.3 (Theorem 8.4).

Let $\mathcal{G}=(V,E_{\rightarrow},E_{\leftrightarrow}=\emptyset)$ be a directed graph such that $V=C_1\,\dot{\cup}\,\cdots\,\dot{\cup}\,C_n$, with $C_i$ being a $k_i$-cycle, and $\operatorname{pa}(C_i)\subseteq\bigcup_{j=0}^{i}C_j$. Then $\mathcal{G}$ is generically identifiable if and only if for every cycle $C=\{v_1,v_2\}$ of size $2$, we have $\operatorname{pa}(C)\setminus C=\operatorname{pa}(v_i)\setminus C$ for $i\in\{1,2\}$.

Proof of Theorem 7.4.

We know from Lemma 7.3 that if $k_i\neq 2$ then $\lambda_{C_i,v}$ is identifiable. If the set $S=\{i\in[n] : k_i=2\}$ is empty, then we know from Lemma B.2 and the fact that $E_{\leftrightarrow}=\emptyset$ that $\Lambda$ is identifiable. Otherwise, let $m=\min S$ and $C_m=\{v_1,v_2\}$.

We know from Example 7.2 that we can choose $(\tilde{\lambda}_{v_1v_2},\tilde{\lambda}_{v_2v_1})=(b_{v_2v_2}/b_{v_1v_2},\,b_{v_1v_1}/b_{v_2v_1})$. If $m=1$, letting $\tilde{\lambda}_{u,v}=\lambda_{u,v}$ for $v\notin C_1$, the matrix $A$ of Eq. 5.4 will have the following shape

\[
\begin{bmatrix}
0 & \det((B_{\Lambda})_{\{v_1,v_2\},\{v_1,v_2\}})/b_{v_2v_1} & 0 & \cdots & 0 \\
\det((B_{\Lambda})_{\{v_1,v_2\},\{v_1,v_2\}})/b_{v_1v_2} & 0 & 0 & \cdots & 0 \\
0 & 0 & I_{k_2} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & I_{k_n}
\end{bmatrix},
\]

which satisfies all the constraints imposed by 1, proving that $\Lambda$ is not identifiable in this case.
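To make the case $m=1$ concrete, the following Python sketch works out a single 2-cycle numerically. It rests on conventions that are assumptions here rather than restatements of Eq. 5.4: $A=(I-\tilde{\Lambda})^{T}B_{\Lambda}$, and $b_{uv}$ is the $(u,v)$ entry of $B_{\Lambda}=(I-\Lambda)^{-T}$. The function name and the example coefficients are purely illustrative.

```python
def two_cycle_alternative(l12, l21):
    """Return (lam_tilde_12, lam_tilde_21, A) for the 2-cycle v1 <-> v2.

    Assumes l12 and l21 are nonzero and l12 * l21 != 1.
    """
    d = 1.0 - l12 * l21
    # B = (I - Lambda)^{-T} for Lambda = [[0, l12], [l21, 0]], written out by hand.
    b = [[1.0 / d, l21 / d],
         [l12 / d, 1.0 / d]]
    # Alternative coefficients from the text:
    # b_{v2 v2} / b_{v1 v2} and b_{v1 v1} / b_{v2 v1}.
    t12 = b[1][1] / b[0][1]
    t21 = b[0][0] / b[1][0]
    # (I - Lambda_tilde)^T, the assumed left factor of A.
    m = [[1.0, -t21],
         [-t12, 1.0]]
    a = [[sum(m[i][k] * b[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return t12, t21, a


t12, t21, a = two_cycle_alternative(0.5, -2.0)
print(t12, t21)  # alternative coefficients, here 1/l21 and 1/l12
print(a)         # the diagonal entries vanish
```

Under these assumptions the alternative coefficients equal $1/\lambda_{v_2v_1}$ and $1/\lambda_{v_1v_2}$, and the diagonal of $A$ is zero. The signs of the off-diagonal entries depend on the convention chosen for $A$.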

If $m>1$, we know that $\lambda_{C_i,v}$ is identifiable for every $i<m$; hence the matrix $A$ will be as follows

\[
\begin{bmatrix}
I_{k_1} & 0 & \cdots & \cdots & \cdots & \cdots & 0 \\
0 & I_{k_2} & \cdots & \cdots & \cdots & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\
A_{v_1,C_1} & A_{v_1,C_2} & \cdots & 0 & \det((B_{\Lambda})_{\{v_1,v_2\},\{v_1,v_2\}})/b_{v_2v_1} & \cdots & 0 \\
A_{v_2,C_1} & A_{v_2,C_2} & \cdots & \det((B_{\Lambda})_{\{v_1,v_2\},\{v_1,v_2\}})/b_{v_1v_2} & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \ddots & \ddots & \vdots \\
A_{C_n,C_1} & A_{C_n,C_2} & \cdots & \cdots & \cdots & \cdots & A_{C_n,C_n}
\end{bmatrix}.
\]

This implies that, in order for the matrix $A$ to satisfy the conditions in 1 for every pair of nodes, we must have $A_{C_m,\operatorname{an}(C_m)\setminus C_m}=0$. This can happen if and only if $A_{C_m,\operatorname{pa}^{*}(C_m)}=0$, where $\operatorname{pa}^{*}(C_m)=\operatorname{pa}(C_m)\setminus C_m$. Writing $A_{v_1,\operatorname{pa}^{*}(C_m)}=0$ explicitly, we arrive at the following linear system

\[
(B_{\Lambda})_{\operatorname{pa}(v_1)\setminus\{v_2\},\operatorname{pa}^{*}(C_m)}^{T}\cdot\tilde{\lambda}_{\operatorname{pa}(v_1)\setminus\{v_2\},v_1}
=(B_{\Lambda})_{v_1,\operatorname{pa}^{*}(C_m)}^{T}-\frac{1}{\lambda_{v_2v_1}}(B_{\Lambda})_{v_2,\operatorname{pa}^{*}(C_m)}^{T}. \tag{B.3}
\]

We know that the system

\[
(B_{\Lambda})_{\operatorname{pa}(v_1)\setminus\{v_2\},\operatorname{pa}^{*}(C_m)}^{T}\cdot\tilde{\lambda}_{\operatorname{pa}(v_1)\setminus\{v_2\},v_1}
=(B_{\Lambda})_{v_1,\operatorname{pa}^{*}(C_m)}^{T}
\]

always has a solution, given by $\lambda_{\operatorname{pa}(v_1)\setminus\{v_2\},v_1}$. Hence, the system in Eq. B.3 has a solution if and only if the system

\[
(B_{\Lambda})_{\operatorname{pa}(v_1)\setminus\{v_2\},\operatorname{pa}^{*}(C_m)}^{T}\cdot\tilde{\lambda}_{\operatorname{pa}(v_1)\setminus\{v_2\},v_1}
=(B_{\Lambda})_{v_2,\operatorname{pa}^{*}(C_m)}^{T} \tag{B.4}
\]

has one. Using $B_{\Lambda}=(I-\Lambda)^{-T}$ and $\lambda_{v_1,\operatorname{pa}^{*}(C_m)}=0$, we can write

\[
(B_{\Lambda})_{v_2,\operatorname{pa}^{*}(C_m)}^{T}
=(B_{\Lambda})_{\operatorname{pa}(v_2)\setminus\{v_1\},\operatorname{pa}^{*}(C_m)}^{T}\cdot\lambda_{\operatorname{pa}(v_2)\setminus\{v_1\},v_2}.
\]

This implies that the system in Eq. B.4 has solutions for a generic choice of $\lambda_{\operatorname{pa}(v_2)\setminus\{v_1\},v_2}$ if and only if the row space of $(B_{\Lambda})_{\operatorname{pa}(v_1)\setminus\{v_2\},\operatorname{pa}^{*}(C_m)}$ contains the row space of $(B_{\Lambda})_{\operatorname{pa}(v_2)\setminus\{v_1\},\operatorname{pa}^{*}(C_m)}$, that is, if

\[
\operatorname{rank}\big((B_{\Lambda})_{\operatorname{pa}(v_1)\setminus\{v_2\},\operatorname{pa}^{*}(C_m)}\big)
=\operatorname{rank}\big((B_{\Lambda})_{\operatorname{pa}^{*}(C_m),\operatorname{pa}^{*}(C_m)}\big).
\]

From Lemma A.2, one can see that this is possible if and only if the graphical condition of the theorem is satisfied. ∎
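The graphical condition of the theorem is easy to test mechanically. The following Python sketch checks, for a directed graph given as a set of edges, whether both nodes of every 2-cycle have the same parents outside the cycle. The function name and the edge-set encoding are assumptions made for illustration; the sketch does not verify the structural hypotheses of the theorem (disjoint cycles, no bidirected edges).

```python
def two_cycles_share_external_parents(edges):
    """Check pa(C) minus C == pa(v_i) minus C for every 2-cycle C = {v1, v2}.

    edges: iterable of (u, v) pairs, each meaning u -> v.
    """
    edges = set(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(u, set())
        parents.setdefault(v, set()).add(u)
    for u, v in edges:
        if u != v and (v, u) in edges:  # u -> v -> u is a 2-cycle
            cycle = {u, v}
            # pa(C) is the union of pa(u) and pa(v), so the condition holds
            # exactly when both nodes have the same parents outside the cycle.
            if parents[u] - cycle != parents[v] - cycle:
                return False
    return True


# Node 0 points to both nodes of the 2-cycle {1, 2}: condition satisfied.
print(two_cycles_share_external_parents({(1, 2), (2, 1), (0, 1), (0, 2)}))
# Node 0 points only to node 1: condition violated.
print(two_cycles_share_external_parents({(1, 2), (2, 1), (0, 1)}))
```

Graphs without any 2-cycle pass the check trivially, in line with the first part of the proof.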

B.4 Proofs for Section 8.2

Proof of Lemma 8.1.

Let us denote the value of the objective function in the optimization problem of Eq. 8.1 for a matrix $\tilde{\Lambda}\in\mathbb{R}^{\mathcal{G}_D}$ by $O(\tilde{\Lambda})$. By definition of the map $\Phi_{\mathcal{G}}$ we have $(I-\Lambda)^{T}\cdot X=\varepsilon$, which implies $O(\Lambda)=0$. Hence, $\tilde{\Lambda}$ minimizes Eq. 8.1 if and only if $O(\tilde{\Lambda})=0$, that is, if and only if

\[
\tilde{\varepsilon}=(I-\tilde{\Lambda})^{T}\cdot X=(I-\tilde{\Lambda})^{T}B_{\Lambda}\cdot\varepsilon=A\cdot\varepsilon\in\mathcal{M}(\mathcal{G}_B),
\]

and we know from Lemma 3.2 that this is the case if and only if $\tilde{\Lambda}$ satisfies Eq. 3.3. ∎

Appendix C Details for Experiments

C.1 Data Generation

Identification.

For fixed $p$ and $e$, the ADMGs for the experiments in Section 8.1 are generated as follows:

  1. sample a random integer $e_d$ in $\{1,\dots,e\}$,
  2. let $\mathcal{G}_D$ be a randomly generated DAG with $p$ nodes and $e_d$ edges,
  3. let $\mathcal{G}_B$ be a randomly generated undirected graph with $p$ nodes and $e-e_d$ edges,
  4. define $\mathcal{G}$ as $([p],\mathcal{G}_D,\mathcal{G}_B)$.
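The four steps above can be sketched in Python as follows. The text does not specify how the random DAG and the random undirected graph are drawn, so the sketch makes two assumptions: edges are sampled uniformly without replacement from all node pairs, and the DAG is oriented along a random node ordering. The function name `random_admg` is hypothetical.

```python
import itertools
import random


def random_admg(p, e, seed=None):
    """Return (directed, bidirected) edge sets of a random ADMG on nodes 0..p-1."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(p), 2))
    if not 1 <= e <= len(pairs):
        raise ValueError("need 1 <= e <= p * (p - 1) / 2")
    # Step 1: number of directed edges.
    e_d = rng.randint(1, e)
    # Step 2: random DAG. Every edge points from an earlier to a later node
    # of a random ordering, which rules out directed cycles.
    order = list(range(p))
    rng.shuffle(order)
    directed = {(order[i], order[j]) for i, j in rng.sample(pairs, e_d)}
    # Step 3: random undirected graph, read as the bidirected part.
    bidirected = {frozenset(pair) for pair in rng.sample(pairs, e - e_d)}
    # Step 4: the ADMG is the pair of edge sets on the node set {0, ..., p-1}.
    return directed, bidirected


directed, bidirected = random_admg(p=6, e=8, seed=0)
print(len(directed), len(bidirected))  # the two counts add up to e
```

Sampling the two edge sets independently allows a pair of nodes to carry both a directed and a bidirected edge; whether the original experiments permit this is not stated.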

Estimation.

The data for the experiments in Section 8.2 are generated as follows:

  1. for every $v\in V$, we sample $\eta_v$ from a Laplace distribution with mean zero and standard deviation $s_v\sim\mathrm{U}(0.2,3)$,
  2. for every $u\leftrightarrow v\in\mathcal{G}_B$, we sample two independent random vectors $\eta^{1}_{u,v},\eta^{2}_{u,v}$, again with standard deviations $s^{1}_{u,v},s^{2}_{u,v}\sim\mathrm{U}(0.2,3)$,
  3. for every $v\in V$, we have $\varepsilon_v=\eta_v+\sum_{u\leftrightarrow v\in\mathcal{G}_B}\big(w^{v,1}_{uv}\eta^{1}_{uv}+w^{v,2}_{uv}\eta^{2}_{uv}\big)$, where $w^{v,1}_{uv},w^{v,2}_{uv}\sim\mathrm{U}(-5,5)$,
  4. for every $u\to v\in\mathcal{G}$, $\lambda_{uv}\sim\mathrm{U}(-5,5)$, and $X_v=\sum_{u\in\operatorname{pa}(v)}\lambda_{uv}X_u+\varepsilon_v$.
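A minimal Python sketch of this sampling scheme is given below. It assumes that the edge-level terms $\eta^{1}_{u,v},\eta^{2}_{u,v}$ are also Laplace distributed (our reading of "again"), that all scales, weights, and coefficients are drawn once and shared across samples, and that the directed part is acyclic. The names `laplace` and `simulate` are hypothetical.

```python
import math
import random
from graphlib import TopologicalSorter


def laplace(rng, std):
    """Draw a zero-mean Laplace variate with standard deviation `std`."""
    scale = std / math.sqrt(2.0)  # a Laplace(0, b) variable has variance 2 * b**2
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def simulate(nodes, directed, bidirected, n, seed=None):
    """Return (lam, samples); samples[k][i] is the k-th draw of X for nodes[i].

    directed: iterable of (u, v) for u -> v (assumed acyclic);
    bidirected: iterable of (u, v) for u <-> v.
    """
    rng = random.Random(seed)
    directed, bidirected = list(directed), list(bidirected)
    parents = {v: [u for (u, t) in directed if t == v] for v in nodes}
    order = list(TopologicalSorter(parents).static_order())  # parents come first

    # Parameters are drawn once and shared by all n samples (an assumption).
    s_node = {v: rng.uniform(0.2, 3.0) for v in nodes}
    s_edge = {e: (rng.uniform(0.2, 3.0), rng.uniform(0.2, 3.0)) for e in bidirected}
    weight = {(e, v): (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
              for e in bidirected for v in e}
    lam = {e: rng.uniform(-5.0, 5.0) for e in directed}

    samples = []
    for _ in range(n):
        # Steps 1-3: independent noise plus shared terms on bidirected edges.
        eps = {v: laplace(rng, s_node[v]) for v in nodes}
        for e in bidirected:
            eta1, eta2 = laplace(rng, s_edge[e][0]), laplace(rng, s_edge[e][1])
            for v in e:
                w1, w2 = weight[(e, v)]
                eps[v] += w1 * eta1 + w2 * eta2
        # Step 4: solve the structural equations along a topological order.
        x = {}
        for v in order:
            x[v] = eps[v] + sum(lam[(u, v)] * x[u] for u in parents[v])
        samples.append([x[v] for v in nodes])
    return lam, samples


lam, samples = simulate([0, 1, 2], [(0, 1), (1, 2)], [(0, 2)], n=5, seed=1)
print(len(samples), len(samples[0]))
```

The sketch uses only the standard library; `graphlib` requires Python 3.9 or later.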

C.2 Additional Experiments

Fig. 13 shows the performance of our methods when using the RBF kernel, with bandwidth computed using the median heuristic. We see that compared to the polynomial kernel, this choice seems to suffer more from the non-convexity of the objective function. In contrast, it provides a better estimate when initialized at the true parameter value.
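For readers unfamiliar with the median heuristic, the following Python sketch shows one common variant: the bandwidth is set to the median of all pairwise Euclidean distances in the sample. Several variants exist (for example, based on squared distances), and the text does not say which one is used, so the choice below and the function names are assumptions.

```python
import math
import statistics


def median_heuristic(points):
    """Median of all pairwise Euclidean distances between the points."""
    dists = [math.dist(a, b)
             for i, a in enumerate(points)
             for b in points[i + 1:]]
    return statistics.median(dists)


def rbf_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel exp(-||a - b||^2 / (2 * bandwidth^2))."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * bandwidth ** 2))


points = [(0.0,), (1.0,), (3.0,)]
h = median_heuristic(points)  # pairwise distances are 1, 3, 2
print(h, rbf_kernel(points[0], points[1], h))
```

Tying the bandwidth to the data in this way keeps the kernel values away from the degenerate regimes where they are all close to 0 or all close to 1.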

The results for the same data generating process, but using a uniform distribution for the error terms, are shown in Fig. 14.

Figure 13: The performance of our proposed estimator with a Gaussian kernel for different initial values. “EL” stands for Empirical Likelihood, “REG” for regression coefficient, and “TV” for true value. We use the normalized Frobenius loss between the estimated matrix $\hat{\Lambda}$ and the true matrix $\Lambda$, i.e., $\|\hat{\Lambda}-\Lambda\|_F/\|\Lambda\|_F$, as the loss function. We report the mean loss over one hundred randomly sampled $\Lambda$. Notice that the $y$-axis is on a log scale.
Figure 14: The performance of our proposed estimator with a polynomial kernel (top) and RBF kernel (bottom) for different initial values. “EL” stands for Empirical Likelihood, “REG” for regression coefficient, and “TV” for true value. We use the normalized Frobenius loss between the estimated matrix $\hat{\Lambda}$ and the true matrix $\Lambda$, i.e., $\|\hat{\Lambda}-\Lambda\|_F/\|\Lambda\|_F$, as the loss function. We report the mean loss over one hundred randomly sampled $\Lambda$. Notice that the $y$-axis is on a log scale.
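The loss reported in the figures can be computed as in the following Python sketch, which represents matrices as nested lists. The function name is hypothetical, and the sketch assumes that the true matrix is nonzero.

```python
import math


def normalized_frobenius_loss(estimate, truth):
    """Return ||estimate - truth||_F / ||truth||_F for equally shaped matrices."""
    num = math.sqrt(sum((a - b) ** 2
                        for row_a, row_b in zip(estimate, truth)
                        for a, b in zip(row_a, row_b)))
    den = math.sqrt(sum(b ** 2 for row_b in truth for b in row_b))
    return num / den


truth = [[0.0, 3.0], [4.0, 0.0]]
estimate = [[0.0, 3.0], [0.0, 0.0]]
print(normalized_frobenius_loss(estimate, truth))  # 4 / 5 = 0.8
```

A value of 1 corresponds to the trivial estimate $\hat{\Lambda}=0$, which gives a natural reference level when reading the log-scale plots.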