Random Tensor Networks with Non-trivial Links

Newton Cheng¹,
Cécilia Lancien²,
Geoff Penington^1,3,
Michael Walter⁴ &
…
Freek Witteveen ORCID: orcid.org/0000-0002-3731-4455⁵

1414 Accesses
5 Citations
Explore all metrics

Abstract

Random tensor networks are a powerful toy model for understanding the entanglement structure of holographic quantum gravity. However, unlike holographic quantum gravity, their entanglement spectra are flat. It has therefore been argued that a better model consists of random tensor networks with link states that are not maximally entangled, i.e., have non-trivial spectra. In this work, we initiate a systematic study of the entanglement properties of these networks. We employ tools from free probability, random matrix theory, and one-shot quantum information theory to study random tensor networks with bounded and unbounded variation in link spectra, and in cases where a subsystem has one or multiple minimal cuts. If the link states have bounded spectral variation, the limiting entanglement spectrum of a subsystem with two minimal cuts can be expressed as a free product of the entanglement spectra of each cut, along with a Marchenko–Pastur distribution. For a class of states with unbounded spectral variation, analogous to semiclassical states in quantum gravity, we relate the limiting entanglement spectrum of a subsystem with two minimal cuts to the distribution of the minimal entanglement across the two cuts. In doing so, we draw connections to previous work on split transfer protocols, entanglement negativity in random tensor networks, and Euclidean path integrals in quantum gravity.

Reflected entropy in random tensor networks

Article Open access 24 May 2022

Entanglement of purification: from spin chains to holography

Article Open access 22 January 2018

Negativity spectra in random tensor networks and holography

Article Open access 10 February 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

More than twenty years after its discovery, the AdS/CFT correspondence [49] remains the only known example of a theory of quantum gravity.^{Footnote 1} A crucial feature of this correspondence is that the emergence of a (classical) spacetime is closely related to the entanglement structure of the boundary theory. Tensor networks appear to provide useful toy models for this aspect of AdS/CFT, mirroring many of its expected properties in a setting that can be made completely mathematically rigorous [40, 61, 70, 71]. A particularly powerful model is given by random tensor networks, which have the advantage of being highly analytically tractable, while exhibiting remarkably precise agreement with gravitational calculations (even including certain exponentially small corrections) [29, 40, 42, 60, 62,63,64, 80]. Random tensors and tensor networks also arise in a number of other fields of physics, including quantum information, where they have been used to explore generic entanglement properties of quantum states [2, 5, 9,10,11, 17, 18, 20,21,22, 37, 39, 46, 50, 57, 79] and condensed matter physics, e.g., in the study of random circuits and measurements [44, 47, 48, 53, 55, 77, 81, 82].

The most basic version of a random tensor network is characterized by a choice of bond dimension D and a graph $G = (V,E)$, where the vertices $V = V_b \sqcup V_\partial $ of G are partitioned into “bulk” vertices $V_b$ and “boundary” vertices $V_\partial $. To each edge $e \in E$, we associate a maximally entangled state

$$\begin{aligned} \frac{1}{\sqrt{D}} \sum _{i=1}^D \vert ii\rangle \end{aligned}$$

(1.1)

on two D-dimensional Hilbert spaces, one of which is associated with each endpoint of e; each vertex $v \in V$ is therefore associated with a Hilbert space ${\mathcal {H}}_v$ of dimension $D^{\deg (v)}$. Finally, we project each bulk vertex $v_b \in V_b$ onto a Haar random state $\vert \psi _{v_b}\rangle \in {\mathcal {H}}_{v_b}$. The resulting “random tensor network state” lives in the Hilbert space ${\mathcal {H}}_\partial = \otimes _{v_\partial } {\mathcal {H}}_{v_\partial }$ associated with the boundary vertices $v_\partial \in V_\partial $, as shown in Fig. 1. Such states, obtained by projecting maximally entangled edge states onto (not necessarily random) bulk vertex states, are also known as projected entangled pair states (PEPS) in the condensed matter literature [23, 75, 76].

To characterize the typical entanglement structure of random tensor network states, we can compute the von Neumann entropy $H(\rho _A)$ of the reduced density matrix $\rho _A$ on a subset $A \subset V_\partial $ of the boundary vertices. In the limit where the bond dimension D is very large, this entropy can be shown to converge with high probability to $\log (D) \left|\gamma _A\right|$, where $\gamma _A$ is the set of edges crossing the minimal cut (for the moment, assumed to be the unique such cut) in the graph separating A from its boundary complement $V_\partial {\setminus } A$ (see Fig. 2a). This formula is closely analogous to the Ryu–Takayanagi (RT) formula and its generalizations in AdS/CFT, in which entropies are given by the area of minimal surfaces homologous to a subregion of the conformal boundary [32, 33, 41, 45, 65, 66]. Indeed, this connection is one of the primary reasons for studying tensor networks as a toy model of quantum gravity. However, if one goes beyond the von Neumann entropy and studies finer details of the entanglement spectrum of random tensor network states, significant divergences from holography begin to appear, as we will see shortly.

When studying entanglement in either random tensor networks or quantum gravity, or more generally in quantum field theory, it is often convenient to study kth Rényi entropies $H_k(\rho _A) = (1-k)^{-1}\log {{\,\textrm{tr}\,}}[\rho _A^k]$. For integer $k > 1$, these are more amenable to direct computation than the von Neumann entropy, and one can extract the von Neumann entropy by analytic continuation to $k = 1$. The computation of Rényi entropies in random tensor network models is very similar to holographic computations. In both cases, the idea is to use the replica trick—essentially, this is the observation that ${{\,\textrm{tr}\,}}[\rho _A^k] = {{\,\textrm{tr}\,}}[\tau \rho _A^{\otimes k}]$ where $\tau $ is an operator which permutes the k copies of A cyclically. In the holographic computation, this can be written as a path integral, on k copies of the theory, glued together in an appropriate way. By the holographic dictionary, this path integral can then be computed by the action of a bulk geometry with certain boundary conditions [45]. For random tensor networks, one finds that ${{\,\textrm{tr}\,}}[\rho _A^k]$ concentrates around its expectation and can be computed as the partition function of a classical spin model on the bulk vertices, with boundary conditions dictated by the choice of boundary subsystem [40]. This computation will be explained in detail in Sect. 2.1. From these computations, one finds that holographic CFT states and random tensor network states behave quite differently when $k \ne 1$. For random tensor network states, the Rényi entropies are approximately independent of k in the large D limit, meaning their entanglement spectrum is close to “flat”; the boundary state $\rho _A$ is approximately maximally mixed within a certain subspace. On the other hand, CFT states that are dual to semiclassical spacetime geometries have Rényi entropies that vary non-trivially with k, meaning their entanglement spectrum contains a wide range of eigenvalues that contribute significantly to the state. Recently, it has been argued that the class of “fixed-area states” in AdS/CFT do have flat spectra and more generally have an entanglement structure that closely matches random tensor network states [8, 14, 27, 29, 54]. Fixed-area states have a well-defined semiclassical geometry associated with a fixed spatial slice; however, thanks to the uncertainty principle, they cannot describe a single semiclassical spacetime geometry [14]. We discuss the connection between these states and random tensor networks in more detail in “Appendix A.”

In the random tensor network model, the flatness of the spectrum can be traced to the maximally entangled states used as “link states” (see Eq. (1.1) and Fig. 1b) on the edges of the graph, which themselves have flat entanglement spectra. To take results about random tensor networks beyond the fixed-area state regime, it is natural—see, e.g., discussion in [14, 40]—to replace the maximally entangled link states by general states

$$\begin{aligned} \vert \phi _e\rangle = \sum _{i = 1}^D \sqrt{\lambda _{e,i}} \vert ii\rangle . \end{aligned}$$

(1.2)

The variation in the entanglement spectrum $\lambda _{e,i}$ represents the fluctuations in area in semiclassical gravitational states. We introduce this model in Sect. 2. The goal of this paper will be to understand the entanglement spectra of random tensor networks with such non-trivial link states and spectra in a number of different regimes. In fact, a number of our results apply in an even more general setting, where the product of link states $\otimes _e \vert \phi _e\rangle $ is replaced by a completely general “background state” density matrix $\phi _V$. From a quantum gravity perspective, random tensor networks with general background states are needed to model bulk quantum fields in AdS/CFT, which yield significant physical consequences when the bulk entropies are large [6, 7]. Beyond holography, they also play a central role in the quantum information processing tasks of multiparty state merging split transfer [26], a connection we elaborate on in “Appendix B.”

If one considers a random tensor network with non-trivial link states, if there is a single minimal cut $\gamma _A$ for a subsystem A, then the resulting density matrix $\rho _A$ will have an entanglement spectrum that converges to that of $\left|\gamma _A\right|$ copies of the link state along the minimal cut as $D \rightarrow \infty $; indeed, this was implicit in [40]. A more complex question, and the main focus of this work, is the case where there are two minimal cuts, as in Fig. 2b. This situation is motivated by questions in holography: It can be used to study the phase transition at the point where there are two competing minimal surfaces [6, 54], which has been relevant to recent advances on the black hole information paradox [1, 4, 51, 59, 60].

For our first main result, in Sect. 3, we consider a family of link states with increasing bond dimension D as in Eq. (1.2). For each D, the link state has an associated distribution

$$\begin{aligned} \mu _e^{(D)} = \frac{1}{D}\sum _{i=1}^D \delta _{D\lambda _{e,i}}, \end{aligned}$$

where $\delta _{x}$ is a $\delta $-distribution centered at x, so this is the discrete probability distribution given by a uniform distribution over the spectrum of the link state. Note that $\lambda _{e,i}=\lambda _{e,i}^{(D)}$ also depends on D. We then require that the moments

$$\begin{aligned} m_k^{(D)} = D^{k-1} \sum _{i=1}^D \lambda _{e,i}^k \end{aligned}$$

of the distributions $\mu _e^{(D)}$ converge to a finite limit as $D\rightarrow \infty $ for all positive integer k. We refer to this as the bounded spectral variation limit. This means, in particular, that we must have $\lambda _{e,i} = {\mathcal {O}}(1/D)$ for all but a vanishing fraction of the eigenvalues $\lambda _{e,i}$. If we let $\gamma _A$ denote a minimal cut for a boundary domain A, then we may similarly define the associated distribution

$$\begin{aligned} \mu _{\gamma _A}^{(D)} =\frac{1}{D^{|\gamma _A|}} \sum _{\{i_e\}_{e \in \gamma _A}} \delta _{D^{|\gamma _A|} \prod _{e \in \gamma _A} \lambda _{e,i_e}}. \end{aligned}$$

By assumption, the moments of the distribution $\mu _{\gamma _A}^{(D)}$ converge to a finite limit (because those of each distribution $\mu _e^{(D)}$ do), implying that $\mu _{\gamma _A}^{(D)}$ converges weakly to some distribution $\mu _{\gamma _A}$. Now, consider the empirical distribution of the spectrum of reduced state $\rho _A$, which is the (random) distribution

$$\begin{aligned} \mu _A^{(D)} = \frac{1}{D^{\left|\gamma _A\right|}} \sum _{\lambda \in {{\,\textrm{spec}\,}}(\rho _A)} \delta _{D^{\left|\gamma _A\right|} \lambda }. \end{aligned}$$

In the case where there are two non-intersecting minimal cuts $\gamma _{A,1}$ and $\gamma _{A,2}$, we find that $\mu ^{(D)}_A$ converges weakly, in probability, to a limiting distribution $\mu _A$ given by a free product ${{\,\textrm{MP}\,}}(1) \boxtimes \mu _{\gamma _{A,1}} \boxtimes \mu _{\gamma _{A,2}}$, a notion from the theory of free probability. Here, ${{\,\textrm{MP}\,}}(1)$ is the Marchenko–Pastur distribution of parameter 1. The situation is summarized by our first main result, which we state more precisely as Theorem 3.4:

Theorem

(Informal). Consider a family of link states in the bounded spectral variation limit. If there is a unique minimal cut $\gamma _A$ for a boundary subsystem A, then $\mu _A^{(D)}$ converges weakly, in probability, to $\mu _{\gamma _A}$, while if there are exactly two non-intersecting minimal cuts $\gamma _{A,1}$ and $\gamma _{A,2}$, it converges to ${{\,\textrm{MP}\,}}(1) \boxtimes \mu _{\gamma _{A,1}} \boxtimes \mu _{\gamma _{A,2}}$.

In Sect. 3.3, we briefly discuss the closely related problem of computing the entanglement negativity spectrum in the same regime.

For our second main result, in Sect. 4, we investigate a different regime, in which link states are allowed to have unbounded spectral variation in the large D limit. This is the more relevant regime for holography, where fluctuations in the area of a surface (in Planck units) grow sublinearly but without bound in the semiclassical limit. When the spectral variation is unbounded, there still exists a reasonable notion of a minimal cut that determines the entanglement spectrum of the boundary state, but the key difference is that minimality must now be defined entropically, rather than geometrically. In fact, the underlying graph essentially plays no role in this regime. A sensible way to formalize this would be to use one-shot entropies: We might say that a cut is “minimal” if the rank of the state along the cut is smaller than the inverse of the largest element in the entanglement spectrum along any other cut. This condition, while intuitive, is a little too restrictive, and one can use smooth conditional entropies to get a weaker, but still meaningful, condition. In Definition 4.6, following [6] we introduce the notion of a unique $(\varepsilon , K)$-minimal cut $\Gamma _A$, where $\Gamma _A\subseteq V$ is a subset of vertices such that the $\varepsilon $-smooth conditional min-entropy of $\Gamma _A$ compared to competing cuts either contained in $\Gamma _A$ or containing $\Gamma _A$ is lower bounded by K. Intuitively, we want K to be as large as possible, and indeed, we show in Theorem 4.7 that the spectrum of a reduced density matrix $\rho _A$ will be close to the spectrum along a unique $(\varepsilon , K)$-minimal cut with an error exponentially small in K.

The situation is more complicated if there are two non-intersecting $(\varepsilon , K)$-minimal cuts, defined in Definition 4.10 as both cuts satisfying an $(\varepsilon , K)$-minimality property for the same $\varepsilon $ and K. Here, we need to impose a regularity condition on the link states, motivated by the example of a link state that is a (large) number of copies of a fixed state:

$$\begin{aligned} \vert \phi _e\rangle = \vert \phi _0\rangle ^{\otimes n} \end{aligned}$$

(1.3)

where $\vert \phi _0\rangle $ is a bipartite state with local dimension d, so the total bond dimension is $D = d^n$. In this case, if $\phi _0$ is not maximally entangled, the measure $\mu _e^{(D)}$ will not converge with increasing n as the spectrum is not concentrated around $\frac{1}{D}$. However, using a different measure, the entanglement spectrum of $\vert \phi _e\rangle $ satisfies a central limit theorem. Namely, if $\vert \phi _e\rangle $ has Schmidt coefficients $\{\lambda _i\}$, and $\vert \phi _0\rangle $ has entanglement entropy $H_0$, then the distribution of the random variable $X^{(n)}$ which takes values $\frac{1}{\sqrt{n}}(\log (\frac{1}{\lambda _i}) - nH_0)$ with probability $\lambda _i$, converges weakly to a centered Gaussian distribution as $n\rightarrow \infty $. Since we subtracted the entropy $nH_0$, the random variable $X^{(n)}$ has expectation zero. Its variance can be thought of as a measure of the fluctuation of $\log (\frac{1}{\lambda _i})$ around the entropy and is relevant for second-order asymptotic rates in quantum information processing tasks [72]. We take this central limit theorem as motivation for a regularity condition on the spectra of general link and background states, and we allow the states to have varying bond dimensions D(n), e.g., $D(n) \sim d^n$. To be more precise, we define the following measure along a cut $\gamma _A$:

$$\begin{aligned} \nu _{\gamma _A}^{(n)} = \sum _{\{i_e\}_{e \in \gamma _A}} \prod _{e \in \gamma _A} \lambda _{e,i_e} \delta _{\frac{1}{\sqrt{n}}\left( \sum _{e \in \gamma _A} \log \frac{1}{\lambda _{e,i_e}} - H(n)\right) }, \end{aligned}$$

where H(n) is a function of n (which one can think of as being approximately equal to the entanglement entropy along the cut $\gamma _A$), and we assume that $\nu _{\gamma _A}^{(n)}$ converges weakly to a continuous distribution. Note that the distribution described above reduces to the distribution of $X^{(n)}$ if the link state is of the form in Eq. (1.3) and is very different from the distribution $\mu _A^{(D)}$, we study for link states with bounded spectral variation. Similarly, we let

$$\begin{aligned} \nu _{A}^{(n)} = \sum _{\lambda \in {{\,\textrm{spec}\,}}(\rho _A)} \lambda \, \delta _{\frac{1}{\sqrt{n}}(\log (\frac{1}{\lambda }) - H(n))} \end{aligned}$$

be the corresponding (random) distribution for the boundary spectrum. Knowledge of this distribution allows computation of the entropy of $\rho _A$ (and fluctuations) as a correction to H(n). In the random tensor network setting, we find that in a situation with two competing minimal cuts, the random tensor network will ‘select’ the minimal parts of each cut, in the following sense:

Theorem

(Informal). Assume that we have a family of (states with) two non-intersecting $(\varepsilon (n),K(n))$-minimal cuts $\gamma _{A,1}$ and $\gamma _{A,2}$, as defined in Definition 4.10. Suppose the entanglement spectra along the two minimal cuts are such that $\nu _{\gamma _{A,1}}^{(n)}$ and $\nu _{\gamma _{A,2}}^{(n)}$ converge weakly to continuous measures $\nu _1$ and $\nu _2$, respectively, as $n \rightarrow \infty $. Then, $\nu _{A}^{(n)}$ converges weakly, in probability, to $\textrm{min}_*(\nu _1,\nu _2)$ which is the pushforward of $\nu _1$ and $\nu _2$ along the function $\min :\mathbbm {R}\times \mathbbm {R}\rightarrow \mathbbm {R}$. In other words, for any bounded continuous function $f \in C_b(\mathbbm {R})$

$$\begin{aligned} \sum _{\lambda \in {{\,\textrm{spec}\,}}(\rho _A)} \lambda \, f\left(\frac{\log (\frac{1}{\lambda }) - H(n)}{\sqrt{n}}\right) \rightarrow \int \int f(\min (x_1,x_2)) \, \textrm{d}\nu _1(x_1) \textrm{d}\nu _2(x_2)\nonumber \\ \end{aligned}$$

(1.4)

in probability.

We also show in Corollary 4.14 that, up to an error of size ${\mathcal {O}}(\log (n))$, this allows us to compute the entropy of $\rho _A$.

An analogous statement to (1.4) was previously conjectured in [54] to be valid in quantum gravity and justified at a physics level of rigor in [6]. Our proof of (1.4) is closely related to the arguments in [6], but converting the physical arguments into a mathematical proof and careful controlling the relevant sources of error requires significant technical work, and constitutes the majority of Sect. 4.

In “Appendix A,” we relate our results to the study of holographic gravity computations, particularly in situations with competing minimal surfaces. This is not needed to understand the results of this work, but provides additional motivation for the relevance of our results and clarifies the way in which random tensor networks provide a useful toy model for holographic quantum gravity. Then, in “Appendix B,” we discuss the relation of our results to split transfer and its relevance in holography. This is again not needed to understand the main text, but rather pointing at topics our work is connected to. Next, in “Appendix C,” we prove two technical lemmas on joint smoothing that are necessary to analyze competing minimal cuts in the unbounded spectral variation regime. Finally, in “Appendix D,” we prove that a certain function on the symmetric group is a metric; this is not directly used in the current work, but may be of independent interest for the study of random tensor networks.

After the completion of this manuscript, we became aware of independent work by Jinzhao Wang [78] on the use of free products to describe entanglement in the toy model of quantum gravity introduced in [60] that has strong overlap with the ideas in Sect. 3.2 and “Appendix A.”

1.1 Notation and Conventions

For $k \in \mathbbm {N}$, we let $[k] = \{1, \ldots , k\}$, and we denote by $S_k$ the group of permutations of this set. We denote by $\Vert a\Vert _p$ the $\ell _p$-norm of a vector a, defined by $\Vert a\Vert _p^p = \sum _i \left|a_i\right|^p$. If a and b are vectors of different dimension, we extend the shorter vector by zeros and still write $\Vert a-b\Vert _p$ for their distance. For example, if $a \in \mathbbm {C}^{d_1}$ and $b \in \mathbbm {C}^{d_2}$ with $d_2 > d_1$, we write $\Vert a - b\Vert _1 = \sum _{i=1}^{d_1} \left|a_i - b_i\right| + \sum _{i=d_1 + 1}^{d_2} \left|b_i\right|$. We also denote by $\Vert A\Vert _p$ the Schatten p-norm of an operator A, defined as the $\ell _p$-norm of the singular values $\{s_i\}$ of A; it can also be computed by $\Vert A\Vert _p^p = {{\,\textrm{tr}\,}}((A^\dagger A)^{p/2})$. The operator norm is given by $\Vert A\Vert _{\infty } = \max \{s_i\}$. If ${\mathcal {H}}$ is a Hilbert space, we introduce the notation ${\mathcal {P}}({\mathcal {H}})$ for the set of positive semidefinite operators on ${\mathcal {H}}$. We often refer to positive semidefinite operators as “density operators” or “states,” without requiring them to be normalized to unit trace. We write ${\mathcal {P}}_{\scriptscriptstyle {=}}({\mathcal {H}})$ for the set of $\rho \in {\mathcal {P}}({\mathcal {H}})$ with unit trace, ${{\,\textrm{tr}\,}}[\rho ] = 1$, and we denote by ${\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ the set of subnormalized states, that is, $\rho \in {\mathcal {P}}({\mathcal {H}})$ with ${{\,\textrm{tr}\,}}[\rho ] \le 1$. We use the convention that for a vector $\vert \phi \rangle $, we denote the corresponding pure state by $\phi $, so $\phi = \mathinner {|\phi \rangle \langle \phi |}$. Given a positive semidefinite operator $\rho $, we denote by ${{\,\textrm{spec}\,}}(\rho )$ the vector containing its spectrum in non-increasing order, and we write ${{\,\textrm{spec}\,}}_+(\rho )$ for the nonzero part of the spectrum. It is a well-known fact that

$$\begin{aligned} \Vert {{\,\textrm{spec}\,}}(\rho ) - {{\,\textrm{spec}\,}}(\sigma )\Vert _1 = \Vert {{\,\textrm{spec}\,}}_+(\rho ) - {{\,\textrm{spec}\,}}_+(\sigma )\Vert _1 \le \Vert \rho - \sigma \Vert _1 \end{aligned}$$

(1.5)

. (In the second expression, we use the convention for the distance of vectors of possibly different dimension introduced above.) If A is a quantum system with Hilbert space ${\mathcal {H}}_A$, we write ${\mathcal {P}}(A) = {\mathcal {P}}({\mathcal {H}}_A)$, ${\mathcal {P}}_{\scriptscriptstyle {=}}(A) = {\mathcal {P}}_{\scriptscriptstyle {=}}({\mathcal {H}}_A)$, and ${\mathcal {P}}_{\scriptscriptstyle {\le }}(A) = {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}}_A)$, and we use subscripts, e.g., $\rho _A \in {\mathcal {P}}(A)$, to indicate which system and Hilbert space a quantum state is associated with. For a bipartite state $\rho _{AB} \in {\mathcal {P}}(AB)$ defined on a tensor product of Hilbert spaces ${\mathcal {H}}_{AB} = {\mathcal {H}}_A \otimes {\mathcal {H}}_B$, we obtain the reduced state or reduced density operator $\rho _A \in {\mathcal {P}}(A)$ by taking the partial trace over the complement: $\rho _A = {{\,\textrm{tr}\,}}_B[\rho _{AB}]$, and similarly in multipartite situations. Finally, we adopt the standard notation that if $\mu _n$ is some sequence of finite measures, we write $\mu _n \Rightarrow \mu $ if $\mu _n$ converges weakly (or in distribution) to a finite measure $\mu $, meaning that for any bounded continuous function $f \in C_b(\mathbbm {R})$,

$$\begin{aligned} \int f(x) \textrm{d}\mu _n(x) \rightarrow \int f(x) \textrm{d}\mu (x). \end{aligned}$$

(1.6)

If $\mu _n$ is a sequence of random finite measures on $\mathbbm {R}$, we say that the sequence $\mu _n$ converges weakly, in probability, to a finite measure $\mu $, if, for any bounded continuous function $f \in C_b(\mathbbm {R})$, Eq. (1.6) converges in probability, i.e., if for every $\varepsilon > 0$ we have that

$$\begin{aligned} \lim _{n \rightarrow \infty } \Pr \left( \left|\int f(x) \textrm{d}\mu _n(x) - \int f(x) \textrm{d}\mu (x) \right| \ge \varepsilon \right) = 0. \end{aligned}$$

In this situation, we will also write $\mu _n \Rightarrow \mu $, in probability. All logarithms are to base 2.

2 Random Tensor Network States

We first review the random tensor network model, closely following [29, 40]. Let $G = (V,E)$ be a connected undirected graph, and let $V = V_\partial \sqcup V_b$ be a partition of the vertices into a set of boundary vertices $V_\partial $ and bulk vertices $V_b$. If $A \subseteq V_\partial $, we write ${\bar{A}} = V_\partial {\setminus } A$. We assign a bond dimension $D_e$ to each edge, and we will consider families of states with increasing bond dimensions; for example, we may take $D_e = D$ for all edges and let D increase. For each vertex $x \in V$, let $\partial \{x\}$ denote the set of edges $e = (xy) \in E$ connecting x to some $y \in V$. We define Hilbert spaces ${\mathcal {H}}_{e,x} = \mathbbm {C}^{D_e}$ for $e \in \partial \{x\}$, and ${\mathcal {H}}_x:= \bigotimes _{e \in \partial \{x\}} {\mathcal {H}}_{e,x}$. We call the pair (e, x) a half-edge. Moreover, we write ${\mathcal {H}}_{e} = {\mathcal {H}}_{e,x} \otimes {\mathcal {H}}_{e,y}$ for an edge $e = (xy) \in E$. Let $D_x = \dim ({\mathcal {H}}_x)$. For a subset $A \subseteq V$, we write ${\mathcal {H}}_A = \bigotimes _{x \in A} {\mathcal {H}}_x$, and similarly, for a subset $S \subseteq E$ we write ${\mathcal {H}}_S = \bigotimes _{e \in S} {\mathcal {H}}_e$. Similarly, for a set T of half-edges we write ${\mathcal {H}}_T = \bigotimes _{(e,x) \in T} {\mathcal {H}}_{e,x}$. At each edge $e = (xy) \in E$, we place a pure state $\vert \phi _e\rangle \in {\mathcal {P}}_{\scriptscriptstyle {=}}({\mathcal {H}}_e)$

$$\begin{aligned} \vert \phi _{e}\rangle = \sum _{i=1}^{D_e} \sqrt{\lambda _{e,i}} \vert ii\rangle \in {\mathcal {H}}_e = {\mathcal {H}}_{e,x} \otimes {\mathcal {H}}_{e,y}, \end{aligned}$$

(2.1)

that we call a link state. Then, $\phi _{e,x} = \phi _{e,y} = \sum _{i=1}^D \lambda _{e,i} \mathinner {|i\rangle \langle i|}$ is the reduced density matrix of the link state on either of the two subsystems. We refer to the vector ${{\,\textrm{spec}\,}}(\phi _{e,x}) = {{\,\textrm{spec}\,}}(\phi _{e,y})$, which is ordered in non-increasing fashion, as the entanglement spectrum of $\phi _e$. Let $\phi \in {\mathcal {P}}_{\scriptscriptstyle {=}}(V)$ be the full state on edges given by the tensor product of link states

$$\begin{aligned} \vert \phi \rangle = \bigotimes _{e \in E} \vert \phi _{e}\rangle . \end{aligned}$$

(2.2)

At every bulk vertex $x \in V_b$, we place a random vector $\vert \psi _x\rangle \in {\mathcal {H}}_x$, where the entries of $\vert \psi _x\rangle $ are independent standard (circularly symmetric) complex Gaussian random variables: Each entry of the tensor can be written as $\frac{1}{\sqrt{2}}(x + iy)$ where x and y are independent real Gaussian random variables of mean 0 and unit variance. We note that, in the model of [40], the tensors $\vert \psi _x\rangle $ were not chosen as random Gaussian vectors, but as uniformly random vectors on the unit sphere. However, for our choice of Gaussian $\vert \psi _x\rangle $, the norm $\Vert \vert \psi _x\rangle \Vert $ is independent of the normalized vector $\vert \psi _x\rangle /\Vert \vert \psi _x\rangle \Vert $, and $\vert \psi _x\rangle /\Vert \vert \psi _x\rangle \Vert $ will be a uniformly random vectors on the unit sphere. Therefore, these two models only differ by their normalization. We write $\vert \psi \rangle = \bigotimes _{x \in V_b} \vert \psi _x\rangle $. The resulting random tensor network state $\rho \in {\mathcal {P}}(V_\partial )$ is defined by

$$\begin{aligned} \vert \rho \rangle = (I_{V_\partial } \otimes \langle \psi \vert ) \vert \phi \rangle . \end{aligned}$$

(2.3)

The random tensor network state is obtained by projecting the link states onto random vectors, so that the final state lives in the boundary Hilbert space. We can make this manifest by using the cyclicity of the trace to write the density matrix:

$$\begin{aligned} \rho = \left( I_{V_{\partial }} \otimes \langle \psi \vert \right) \phi \left( I_{V_{\partial }} \otimes \vert \psi \rangle \right) = {{\,\textrm{tr}\,}}_{V_b}\left[\left( I_{V_{\partial }} \otimes \psi \right) \phi \right]. \end{aligned}$$

(2.4)

Note that this state need not be normalized, but we chose the standard deviation of the $\vert \psi _x\rangle $ such that $\rho $ is normalized on average, given that the link state $\phi $ is normalized:

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}[\rho ] = {{\,\textrm{tr}\,}}[\phi ]. \end{aligned}$$

(2.5)

In Sect. 2.3.1, we prove the stronger statement that $\rho $ is normalized with high probability for appropriately connected tensor networks and large bond dimension. Note also, that in Eq. (2.1), we have chosen states which have a Schmidt decomposition in a fixed basis (the standard basis). Since we project onto uniformly random tensors, we can choose to do so without loss of generality.

2.1 The Replica Trick for Random Tensor Networks

We now consider a boundary subset $A \subseteq V_{\partial }$ and use the replica trick to study the Rényi entropies of the reduced density matrix $\rho _A$. The replica trick for random tensor network models was first studied in [40], and it is the key tool we apply throughout this work. Let ${\mathcal {H}}$ be a Hilbert space. The Rényi entropies of a (normalized) density matrix $\rho \in {\mathcal {P}}_{\scriptscriptstyle {=}}({\mathcal {H}})$ are defined by

$$\begin{aligned} H_k(\rho ) = \frac{1}{1-k}\log ({{\,\textrm{tr}\,}}[\rho ^k]) \end{aligned}$$

for $k \in (0,1) \cup (1,\infty )$. For $k = 0,1,\infty $, there are well-defined limits, given by

$$\begin{aligned} \begin{aligned} H_0(\rho )&:= \log ({{\,\textrm{rank}\,}}(\rho ))\\ H_1(\rho )&:= -{{\,\textrm{tr}\,}}[\rho \log (\rho )]\\ H_{\infty }&:= -\log (\Vert \rho \Vert _{\infty }). \end{aligned} \end{aligned}$$

(2.6)

In particular, we see that $H(\rho ) = H_1(\rho )$ is the von Neumann entropy. We will also write $H(A)_\rho := H(\rho _A)$ and $H_k(A)_{\rho }:= H_k(\rho _A)$ for reduced density matrices. If $\rho \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ is subnormalized, we let

$$\begin{aligned} H_k(\rho ) = \frac{1}{1-k}\log \frac{{{\,\textrm{tr}\,}}\left[\rho ^k\right]}{{{\,\textrm{tr}\,}}[\rho ]}. \end{aligned}$$

(2.7)

Denote by R the representation of $S_k$ on ${\mathcal {H}}^{\otimes k}$ which permutes the k copies of ${\mathcal {H}}$ according to the action of $S_k$. We will write $R_x(\pi )$ when ${\mathcal {H}}= {\mathcal {H}}_x$ and $R_A(\pi )$ if ${\mathcal {H}}= {\mathcal {H}}_A$ for $A \subseteq V$. We let $\tau $ denote the standard k-cycle in $S_k$, i.e.,

$$\begin{aligned} \tau = (1 2 \ldots k). \end{aligned}$$

The key idea of the replica trick is the observation that the kth moment of $\rho \in {\mathcal {P}}({\mathcal {H}})$ can be written as

$$\begin{aligned} {{\,\textrm{tr}\,}}\left[\rho ^k\right] = {{\,\textrm{tr}\,}}\left[R(\tau )\rho ^{\otimes k}\right]. \end{aligned}$$

(2.8)

Recall the notion of the cycle type of a permutation $\pi $: If $\pi $ can be written as a product of m disjoint cycles of lengths $l_1, \ldots , l_m$, then $\pi $ has cycle type $C(\pi ) = \{l_1,\ldots ,l_m\}$. Then, for an arbitrary $\pi \in S_k$,

$$\begin{aligned} {{\,\textrm{tr}\,}}\left[ R(\pi )\rho ^{\otimes k}\right] = \prod _{l \in C(\pi )} {{\,\textrm{tr}\,}}\left[ \rho ^l \right]. \end{aligned}$$

Note that this is the generalization of the well-known swap trick for two copies of a state $\rho $. The other crucial ingredient is a property of the Gaussian random vectors:

$$\begin{aligned} \mathbbm {E}\left[\psi _x^{\otimes k}\right] = \sum _{\pi \in S_k} R_x(\pi ). \end{aligned}$$

(2.9)

Using Eq. (2.4), we may then compute

$$\begin{aligned} \begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[\rho _A^k\right]&= \mathbbm {E}{{\,\textrm{tr}\,}}\left[R_A(\tau ) \rho _{A}^{\otimes k}\right] \\&= \mathbbm {E}{{\,\textrm{tr}\,}}\left[\left( R_A(\tau ) \otimes R_{V_\partial {\setminus } A}(\textrm{id})\right) \rho ^{\otimes k}\right]\\&= \mathbbm {E}{{\,\textrm{tr}\,}}\left[\left( R_A(\tau ) \otimes R_{V_\partial {\setminus } A}(\textrm{id})\right) \left( I_{V_{\partial }} \otimes \psi \right) ^{\otimes k} \phi ^{\otimes k}\right] \\&= {{\,\textrm{tr}\,}}\left[\left( R_A(\tau ) \otimes R_{V {\setminus } A}(\textrm{id})\right) \mathbbm {E}\left[(I_{V_\partial } \otimes \psi )^{\otimes k}\right] \phi ^{\otimes k}\right]. \end{aligned} \end{aligned}$$

(2.10)

To further simplify this expression, we define the following set:

$$\begin{aligned} {\mathcal {S}}_{A,\sigma } = \left\{ \{\pi _x\}_{x \in V} : \pi _x \in S_k, \text { where } \pi _x = \sigma \text { for } x \in A \text { and } \pi _x = \textrm{id}\text { for } x \in {\bar{A}} \right\} , \end{aligned}$$

(2.11)

for any $\sigma \in S_k$ and $A \subseteq V_\partial $. An element of ${\mathcal {S}}_{A,\sigma }$ assigns a permutation to each vertex in V subject to a “boundary condition.” Now, using Eq. (2.9), we find that

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[\rho _A^k\right]&= \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\tau }} {{\,\textrm{tr}\,}}\left[\bigotimes _{x \in V} R_x(\pi _x) \phi ^{\otimes k}\right]. \end{aligned}$$

Finally, we observe that for $e = (xy)$

$$\begin{aligned} {{\,\textrm{tr}\,}}\left[R(\pi _x) \otimes R(\pi _y) \phi ^{\otimes k}\right] = \prod _{l \in C(\pi _x^{-1} \pi _y)} {{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right], \end{aligned}$$

where we recall that $\phi _{e,x}$ is the reduced density matrix of the link state on edge $e = (xy)$. Thus, we conclude that

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[\rho _A^k\right]&= \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\tau }} \prod _{e = (xy) \in E} \prod _{l \in C(\pi _x^{-1} \pi _y)} {{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right]. \end{aligned}$$

(2.12)

We can interpret the expectation as the partition function of a classical spin model

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[\rho _A^k\right]&= \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\tau }} 2^{-\sum _{e = (xy) \in E} J_{e}(\pi _x, \pi _y)}, \end{aligned}$$

(2.13)

where the site variables in the spin model are permutations $\pi _x \in S_k$, and the interaction at the edges between sites is given by

$$\begin{aligned} J_{e}(\pi _x,\pi _y) = -\sum _{l \in C(\pi _x^{-1} \pi _y)} \log ({{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right]) = \sum _{l \in C(\pi _x^{-1} \pi _y)} (l-1) H_l(\phi _{e,x}), \end{aligned}$$

with $H_l$ being the lth Rényi entropy, and the model as boundary conditions such that the permutation must be $\tau $ on A and $\textrm{id}$ on ${\bar{A}}$. It turns out that $J_e$ is a metric on the symmetric group $S_k$—see “Appendix D.” Similarly, we may place an arbitrary permutation $\pi $ on A instead of $\tau $, which yields (by exactly the same reasoning)

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[R(\pi )\rho ^{\otimes k}\right]&= \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\pi }} \prod _{e = (xy) \in E} \prod _{l \in C(\pi _x^{-1} \pi _y)} {{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right]. \end{aligned}$$

(2.14)

2.2 Maximally Entangled Link States and Minimal Cuts

We will now discuss the special case where all the link states are maximally entangled states of dimension D, which has been studied extensively in [40]. We will generalize the results we discuss here to a wider class of link states in Sect. 3. In this case, the entanglement spectra of the link states are flat: For $e \in E$, we have $\lambda _{e,i} = \frac{1}{D}$ for $i = 1, \ldots , D$. In particular, for all $l \in \mathbbm {N}$ we have $H_l(\phi _{e}) = \log (D)$ and hence

$$\begin{aligned} J_{e}(\pi _x,\pi _y) = \log (D)\sum _{l \in C(\pi _x^{-1} \pi _y)} (l-1). \end{aligned}$$

This leads to the so-called Cayley distance on $S_k$:

$$\begin{aligned} d(\pi _x,\pi _y) = \sum _{l \in C(\pi _x^{-1} \pi _y)} (l-1) = k - \left|C(\pi _x^{-1} \pi _y)\right|, \end{aligned}$$

(2.15)

where $\left|C(\pi )\right|$ is the number of cycles in $\pi $. Moreover, $d(\pi _x,\pi _y)$ is a metric and equals the minimal number of transpositions needed to transform $\pi _x$ into $\pi _y$. We say that $\pi \in S_k$ is on a geodesic between $\pi _1$ and $\pi _2$ if $d(\pi _1, \pi ) + d(\pi , \pi _2) = d(\pi _1,\pi _2)$. (Recall that d is a metric.) We can rewrite the spin model in terms of this distance:

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[\rho _A^k\right]&= \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\sigma }} 2^{-\log (D) \sum _{e = (xy) \in E} d(\pi _x, \pi _y)}. \end{aligned}$$

(2.16)

The physically inclined reader may observe that the logarithm of the bond dimension has the role of an inverse temperature, and for large D, the dominant contribution to the partition function will be the ground state of the spin model, subject to the relevant boundary conditions.

To describe the dominant contribution to the sum in Eq. (2.16) for large D, we need the minimal cuts for A in G. A cut for A is a subset of the vertices $\Gamma _A \subset V$ such that $\Gamma _A \cap V_{\partial } = A$. Throughout this work, we will denote the set of all cuts for A by C(A). We will use the convention of denoting cuts (i.e., subsets of vertices) by capital Greek letters. Given a cut $\Gamma _A \in C(A)$, we will denote the set of edges crossing the cut, that is, edges connecting a vertex in $\Gamma _A$ with a vertex in $V {\setminus } \Gamma _A$, by lowercase Greek letters $\gamma _A$. (And by an abuse of language, also refer to this set as a “cut.”) A minimal cut for A is a cut such that the number of edges $\left|\gamma _A\right|$ is minimal. We write $m(A) = \left|\gamma _A\right|$ for a minimal cut $\gamma _A \in C(A)$. If $\Gamma _A \in C(A)$, we write $\Gamma _A^c = V {\setminus } \Gamma _A$. Note that $\Gamma _A^c$ is a cut for ${\bar{A}} = V_{\partial } {\setminus } A$.

In the simplest case, there is a unique minimal cut $\gamma _A$. For this case, one can show that the dominant configuration is the one in which $\pi _x = \tau $ for $x \in \Gamma _A$ and $\pi _x = \textrm{id}$ for $x \in V {\setminus } \Gamma _A$, see [39], or Proposition 3.3. That is, there are two domains in the spin model corresponding to $\tau $ and $\textrm{id}$, and the minimization of the domain wall corresponds to the minimal cut in the graph.

We will also be interested in the case of exactly two non-intersecting minimal cuts $\Gamma _{A,1}$ and $\Gamma _{A,2}$. In this case, we have that $\Gamma _{A,1} \subset \Gamma _{A,2}$, or $\Gamma _{A,2} \subset \Gamma _{A,1}$. After relabeling, we may assume that the first is the case, and define three domains in the graph: $V = V_1 \sqcup V_2 \sqcup V_3$ given by $V_1 = \Gamma _{A,1}$, $V_2 = \Gamma _{A,2} {\setminus } \Gamma _{A,1}$ and $V {\setminus } \Gamma _{A,2}$. If there are exactly two minimal cuts, then multiple dominant configurations contribute equally to the partition function Eq. (2.16). These dominant configurations can be constructed as follows: For each $\pi $ on a geodesic between $\tau $ and $\textrm{id}$, set $\pi _x = \tau $ for $x \in V_1$, $\pi _x = \pi $ for $x \in V_2$ and $\pi _x = \textrm{id}$ for $x \in V_3$. That these are the dominant configurations follows immediately from the fact that $d(\tau , \pi ) + d(\pi , \textrm{id}) \ge d(\tau , \textrm{id})$, with equality if and only if $\pi $ is on a geodesic between $\tau $ and $\textrm{id}$.

To understand this degeneracy, we use the following fact [56]: the set of permutations $\pi $ on a geodesic between $\tau $ and $\textrm{id}$ is in a one-to-one correspondence with the set of non-crossing partitions NC(k) of [k]. See Sect. 3.1 for a definition and properties of NC(k). Thus, the degeneracy for the kth moment is $\left|NC(k)\right| = C_k$ where

$$\begin{aligned} C_k = \frac{1}{k+1} \left( {\begin{array}{c}2k\\ k\end{array}}\right) \end{aligned}$$

is the kth Catalan number. These are the moments of the Marchenko–Pastur distribution ${{\,\textrm{MP}\,}}(t)$

$$\begin{aligned} \begin{aligned} MP(t)&= \max (1 - t,0) \delta _0 + \nu _t \\ \textrm{d}\nu _t(x)&= \frac{\sqrt{4t - (x - 1 - t)^2}}{2\pi x} \mathbbm {1}_{(x - 1 - t)^2 \le 4t} \textrm{d}x. \end{aligned} \end{aligned}$$

(2.17)

This allows one to show the folklore result (which we prove and extend to more general link states in Theorem 3.4) that upon an appropriate rescaling, the empirical distribution of the spectrum of $\rho _A$ converges to a Marchenko–Pastur distribution. This is in line with the case of a single random tensor, which precisely yields a Wishart matrix (see Sect. 3.1.1 for a brief introduction to these objects). In the first case, where there is a unique minimal cut, the entanglement spectrum of $\rho _A$ is flat, while, as we have seen, in the second case, the degeneracy gives rise to a non-trivial spectrum in the right scaling limit.

2.3 The Replica Trick for General Background States

In Eq. (2.13), we computed the result of the replica trick for the kth moment for a random tensor network state. We will also consider the more general setting where the link state is replaced by some arbitrary state $\phi _V$. In this setting, there need not be a graph structure, and the Hilbert space at each vertex $x \in V$ can be some arbitrary Hilbert space, rather than a tensor product of Hilbert spaces labeled by half-edges. We will refer to $\phi _V$ as a “background state” instead of a “link state” (as the interpretation of links along the edges does not necessarily make sense in this situation). That is, where before we had a link state

$$\begin{aligned} \vert \phi \rangle = \bigotimes _{e \in E} \vert \phi _e\rangle , \end{aligned}$$

we will now consider some arbitrary possibly mixed and subnormalized $\phi _V \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(V)$ in the tensor network construction. We can generalize Eq. (2.4) to also apply for general background states to obtain a state $\rho \in {\mathcal {P}}(V_\partial )$ given by

$$\begin{aligned} \rho = {{\,\textrm{tr}\,}}_{V_b}\left[\left( I_{V_{\partial }} \otimes \psi \right) \phi _V\right] \end{aligned}$$

(2.18)

where $\vert \psi \rangle $ is a tensor product of random states at the bulk vertices. If $\phi $ is pure, then so is $\rho $, since in that case

$$\begin{aligned} \vert \rho \rangle = \left( I_{V_{\partial }} \otimes \langle \psi \vert \right) \vert \phi \rangle . \end{aligned}$$

If $\phi $ is not pure, we can consider a purification $\phi _{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$ and consider R as an additional boundary system; this leads to a random tensor network state $\rho _{V_{\partial }R}$ which is a purification of $\rho _{V_\partial }$. This setup is illustrated in Fig. 3, while formally very similar, the resulting state is no longer a PEPS tensor network state in general.

There are multiple reasons to also allow general background states. The first reason is of a technical nature: They are useful for estimates based on smooth entropies, which we discuss in Sect. 4. In this application, the full state on edges is still pure, but is no longer a tensor product of link states along the edges. A second motivation for considering general background states is that they can be used as a toy model for holographic systems where there is “bulk entropy” present. Finally, these states are closely related to protocols for the quantum information processing task of split transfer [26]. We comment on this connection in “Appendix B.”

Even for a general background state, a version of the replica trick still applies. Consider a boundary subsystem $A \subseteq V_{\partial }$ with corresponding boundary state $\rho _A$. Then, the computation in Eq. (2.10) is still valid, and we find

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}[\rho _A^k]&= \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\tau }} {{\,\textrm{tr}\,}}_V\left[\bigotimes _{x \in V} R_x(\pi _x) \phi _V^{\otimes k}\right] \end{aligned}$$

(2.19)

where $\tau = (1 2 \ldots k)$. However, Eq. (2.19) no longer has the interpretation of a spin model with local interactions.

For general background states, we will only need the replica trick for $k = 2$. Since $S_2$ has only two elements, each configuration of permutations is completely characterized by the domain $\Delta _A = \{x \in V \text { such that } \pi _x = \tau \}$. Because of the boundary conditions in ${\mathcal {S}}_{A,\tau }$, the collection of these sets coincides with C(A), and hence,

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}[\rho _A^2] = \sum _{\Delta _A \in C(A)} {{\,\textrm{tr}\,}}[\phi _{\Delta _A}^2] = \sum _{\Delta _A \in C(A)} {{\,\textrm{tr}\,}}[\phi ] 2^{-H_2(\Delta _A)_\phi } . \end{aligned}$$

(2.20)

Another useful fact is that by Eq. (2.9),

$$\begin{aligned} \mathbbm {E}\rho _{V_\partial } = \phi _{V_\partial }. \end{aligned}$$

(2.21)

We remark that if one only uses the $k = 2$ replica trick, one could also use tensors which are drawn from a projective 2-design, a distribution which produces tensors with the same first and second moments as uniformly random tensors of unit norm [35, 43]. An example of a projective 2-design is the set of uniformly random stabilizer states. For tensors $\vert \psi _x\rangle $ drawn from a projective 2-design of dimension $D_x$, it holds that

$$\begin{aligned} \mathbbm {E}\psi _x^{\otimes 2} = \frac{1}{D_x(D_x - 1)}I + \frac{1}{D_x(D_x - 1)}R_x(\tau ), \end{aligned}$$

and hence,

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}[\rho _A^2] = \frac{D_x}{D_x + 1} \sum _{\Delta _A \in C(A)} {{\,\textrm{tr}\,}}[\phi _{\Delta _A}^2], \end{aligned}$$

which is close to Eq. (2.20) for large $D_x$. Thus, it is not hard to see that all random tensor network results which only use the $k = 2$ replica trick are also valid for states with tensors drawn from projective 2-designs. This was already observed in [40], and random tensor networks with random stabilizer tensors were further studied in [57]. The results of Sect. 4 only use the $k=2$ replica trick and thus will extend to states with tensors drawn from projective 2-designs. This will not be true for the results in Sect. 3, which requires usage of the replica trick for all $k \in \mathbbm {N}$.

2.3.1 Normalization of Random Tensor Network States

One immediate consequence of the replica trick for $k=2$ is that the random tensor network state $\rho $ will be approximately normalized with high probability, so long as a mild condition on the background state is satisfied: The bulk needs to be connected, with sufficiently entangled edges. Let

$$\begin{aligned} \eta = \max _{\Delta \subseteq V_b, \Delta \ne \emptyset } {{\,\textrm{tr}\,}}[\phi _{\Delta }^2] = \max _{\Delta \subseteq V_b, \Delta \ne \emptyset } {{\,\textrm{tr}\,}}[\phi ] 2^{-H_2(\Delta )_\phi }. \end{aligned}$$

(2.22)

If the state has enough correlations along each cut (or more precisely, if $H_2(\Delta )_\phi $ is large for each $\Delta $), then $\eta $ is small. Concretely, if we consider a random tensor network state with maximally entangled link states of bond dimension D, we will have $\eta \le \frac{1}{D}$. We then have

Lemma 1.1

For any background state $\phi \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(V)$, with associated $\rho \in {\mathcal {P}}(V_\partial )$ as in Eq. (2.18), it holds that for any $\varepsilon > 0$

$$\begin{aligned} \Pr \bigl (\left|{{\,\textrm{tr}\,}}[\rho ] - {{\,\textrm{tr}\,}}[\phi ]\right| \ge \varepsilon \bigr ) \le 2^{|V_b|}\frac{\eta }{\varepsilon ^2} \end{aligned}$$

where $\eta $ is defined in Eq. (2.22).

Proof

This follows from a special case of Eq. (2.20). In this case, the empty cut contributes ${{\,\textrm{tr}\,}}[\phi ]^2,$ so we find

$$\begin{aligned} {{\,\textrm{Var}\,}}({{\,\textrm{tr}\,}}[\rho ]) = \mathbbm {E}\left|{{\,\textrm{tr}\,}}[\rho ] - {{\,\textrm{tr}\,}}[\phi ]\right|^2 = \mathbbm {E}\left|{{\,\textrm{tr}\,}}[\rho ]^2 - {{\,\textrm{tr}\,}}[\phi ]^2\right| \le 2^{V_b} \max _{\Delta \subseteq V_b, \Delta \ne \emptyset } {{\,\textrm{tr}\,}}[\phi _{\Delta }^2], \end{aligned}$$

where we have used the normalization of $\rho $ in expectation $\mathbbm {E}{{\,\textrm{tr}\,}}[\rho ] = {{\,\textrm{tr}\,}}[\phi ]$, as in Eq. (2.5). The result follows by an application of Chebyshev’s inequality. $\square $

We can improve this result by taking advantage of the fact that our random projectors are random Gaussian vectors, allowing us to use Gaussian concentration of measure rather than the Chebyshev’s inequality. For instance, using a concentration bound for Gaussian polynomials (see [9], Corollary 5.49) one can show that for any $\varepsilon \ge (\sqrt{2}e)^{2V_b} \eta $:

$$\begin{aligned} \Pr \bigl (\left|{{\,\textrm{tr}\,}}[\rho ] - {{\,\textrm{tr}\,}}[\phi ]\right| \ge \varepsilon \bigr ) \le \exp \left( -\frac{\left|V_b\right|}{2e}\varepsilon ^{\frac{1}{\left|V_b\right|}}\eta ^{-\frac{1}{\left|V_b\right|}}\right) , \end{aligned}$$

where $\eta $ is defined as in Eq. (2.22). We will not need this refinement.

3 Link States with Bounded Spectral Variation

In this section, we study random tensor network states with link states that have bounded spectral variation, meaning that there is an effective bond dimension D such that the Schmidt coefficients of the link state are of the order $\tfrac{1}{D}$.

We start by providing background material on random matrix theory and free probability, which is a key tool in the study of products of random matrices. In Sect. 3.2, we will precisely define the notion of bounded spectral variation and generalize the results in Sect. 2.2 for random tensor network states with maximally entangled link states to this wider class of link states. This leads to the main result of this section, Theorem 3.4, which shows that the asymptotic entanglement spectrum can be expressed in terms of a free product of distributions. We will see that the results are similar to the quantum gravity setup described in “Appendix A.3.” Finally, in Sect. 3.3, we investigate the entanglement negativity for random tensor network states with link states of bounded spectral variation.

3.1 Random Matrices, Free Probability, and Non-crossing Partitions

3.1.1 Random Matrix Theory and Wishart Matrices

We start by reviewing relevant concepts from probability and random matrix theory that are relevant for our analyses. This material can be found in any introduction to random matrix theory, e.g., [3, 15, 58].

A fundamental question in random matrix theory is as follows: Given a family of $n \times n$ matrices with entries selected according to some distribution, what is the asymptotic distribution of the eigenvalues as $n \rightarrow \infty $? This question has been extensively studied and in many cases has an elegant and concise answer. We discuss a basic example which is closely related to our purposes: Wishart matrices. Consider an $n \times m$ matrix X whose entries are drawn i.i.d. from a Gaussian distribution with mean zero and unit variance. The sample covariance matrix of X is the $n\times n$ matrix defined as

$$\begin{aligned} Y_{n,m} = \frac{1}{m}XX^T. \end{aligned}$$

(3.1)

Such random matrices are called (real) Wishart matrices and can be thought of as a sample second moment matrix (where one has m realizations of an n-dimensional random variable). One can also consider complex Wishart matrices: In this case, the entries of the $n \times m$ matrix X are complex i.i.d. standard (circularly symmetric) complex Gaussian random variables. We then let $Y_{n,m} = \frac{1}{m}XX^\dagger $. We would like to understand the spectrum of $Y_{n,m}$, and to that end, we consider the empirical distribution of the eigenvalues. This empirical distribution is itself random, depending on the particular realization of $Y_{n,m}$. To characterize the convergence, we recall that if $\{\mu _n\}_{n \in \mathbbm {N}}$ is a sequence of random finite measures on $\mathbbm {R}$, we say that the sequence $\mu _n$ converges weakly, in probability, to a finite measure $\mu $, if, for any bounded continuous function $f \in C_b(\mathbbm {R})$, it holds that for every $\varepsilon > 0$

$$\begin{aligned} \lim _{n \rightarrow \infty } \Pr \left( \left|\int f(x) \textrm{d}\mu _n(x) - \int f(x) \textrm{d}\mu (x) \right| \ge \varepsilon \right) = 0. \end{aligned}$$

The asymptotic distribution of the eigenvalues of Wishart matrices is known to obey the Marchenko–Pastur law (see, for instance, Theorem 3.6 and Theorem 3.7 in [15]):

Theorem 1.2

Consider (real or complex) Wishart matrices $Y_{n,m}$, and let

$$\begin{aligned} \mu _{n,m} = \frac{1}{n}\sum _{\lambda \in {{\,\textrm{spec}\,}}(Y_{n,m})}\delta _{\lambda } \end{aligned}$$

be the empirical distribution of its eigenvalue spectrum. Suppose that the ratio of dimensions n/m converges to a constant $t > 0$ as $n \rightarrow \infty $. Then, $\mu _{n,m}$ converges weakly, in probability, to the Marchenko–Pastur distribution ${{\,\textrm{MP}\,}}(t)$ with parameter $t > 0$, as defined in Eq. (2.17).

Generalizations to this result are possible. For example, one still has convergence if the entries of X are chosen according to non-Gaussian distributions with mean zero and unit variance. Also, one can prove weak convergence, almost surely (rather than just in probability); see [15].

If $Y_{n,m} = \frac{1}{m}XX^\dagger $ is a complex Wishart matrix, X can also be interpreted as a uniformly random pure quantum state on $\mathbbm {C}^n \otimes \mathbbm {C}^m$, and $Y_{n,m}$, up to normalization, as the reduced density matrix on $\mathbbm {C}^n$ [39]. Note that $\frac{1}{n}Y_{n,m}$ is normalized in expectation in the sense that $\mathbbm {E}\frac{1}{n} Y_{n,m} = 1$. So, complex Wishart matrices can be used as a model for the reduced state of a random bipartite quantum state, and this allows one to quantify the “typical entanglement” of a random state. Equivalently, in the tensor network setting, $\frac{1}{\sqrt{n}}X$ can be thought of as a random tensor network state with a single bulk vertex, two boundary vertices, and maximally entangled link states. We can then can interpret $\frac{1}{n}Y_{n,m}$ as the reduced density matrix on one of the boundary vertices. We will provide a generalization of Theorem 3.1 for the entanglement spectrum of random tensor network states in Theorem 3.4.

3.1.2 Free Probability

The topic of probability distributions in random matrix theory is closely related to free probability and, in particular, to the notion of the free product. We provide a brief introduction here; the material in this section is very standard, and we only review a few relevant aspects. For an extensive treatment, see, for instance, Chapter 5 in [3] or the books [52, 56, 58]. As we will see later, the free product will allow us to concisely formulate replica trick results involving multiple minimal cuts.

A non-commutative probability space is a pair $({\mathcal {A}}, \omega )$, where ${\mathcal {A}}$ is a $C^*$-algebra and $\omega $ is a state on ${\mathcal {A}}$. An element $a \in {\mathcal {A}}$ is called a non-commutative random variable. The key example to have in mind is the space of $n \times n$ random matrices, where the matrix entries are distributed according to some probability distribution, and $\omega (a) = \mathbbm {E}\frac{1}{n}{{\,\textrm{tr}\,}}[a]$ defines a tracial state. If $a \in {\mathcal {A}}$, the distribution (or law) $\mu _a$ of a is defined as a map on polynomials, which evaluates on a polynomial p as $\mu _a(p) = \omega (p(a))$. If a is self-adjoint, it has real spectrum and we can extend the domain of $\mu _a$ to all bounded continuous functions $f \in C_b(\mathbbm {R})$, using the functional calculus to define f(a) and letting $\mu _a(f) = \omega (f(a))$. In this case, we can identify $\mu _a$ with a distribution such that, for $f \in C_b(\mathbbm {R})$, we have $\mu _a(f) = \int f(x) \textrm{d}\mu _a(x)$. In particular, if a is an $n \times n$ self-adjoint random matrix, then $\mu _a(f) = \frac{1}{n}\mathbbm {E}\sum _{\lambda \in {{\,\textrm{spec}\,}}(a)} f(\lambda )$, and we may identify $\mu _a$ with the empirical measure of the eigenvalues of a. If ${\mathcal {A}}$ is a commutative algebra, these notions reduce to the usual notions of probability theory, where $\omega $ is the expectation.

We call a set of n non-commutative random variables $\{a_i\}$ on a non-commutative probability space $({\mathcal {A}},\omega )$ freely independent or just free if, for any set of $k \ge 2$ polynomials $\{p_j\}$, the variables satisfy

$$\begin{aligned} \omega (p_1(a_{i_1})\ldots p_k(a_{i_k})) = 0 \end{aligned}$$

whenever $\omega (p_m(a_{i_m})) = 0$ for all $1 \le m \le k$ and no two adjacent indices $i_m$ and $i_{m+1}$ for $1 \le m \le k-1$ are equal. One can see that two freely independent variables $a_1,a_2$ satisfy:

$$\begin{aligned} 0 = \omega ((a_1-\omega (a_1))(a_2-\omega (a_2))) = \omega (a_1a_2) - \omega (a_1)\omega (a_2), \end{aligned}$$

(3.2)

which, in the commutative case with random variables $x_1$, $x_2$, is the classical bivariate independence condition $\mathbbm {E}[x_1x_2] = \mathbbm {E}[x_1]\mathbbm {E}[x_2]$. The definition of free independence does not specialize to independence in the commutative case. (Commuting independent random variables are only free when they are constant.) However, the role of free independence is analogous to the role of classical independence for commuting random variables: It allows one to, in principle, compute the joint mixed moments of the variables.

We will be interested in the multiplicative free convolution or free product (there also exists an additive convolution or just free convolution) of distributions. Suppose a, b are non-commutative self-adjoint free random variables on $({\mathcal {A}},\omega )$ with distributions $\mu _a$ and $\mu _b$. Then, we denote the distribution of ab by $\mu _{ab} = \mu _a \boxtimes \mu _b$. Note that, generally, ab need not be self-adjoint. However, if $\omega $ is tracial (as in the random matrix case) and a is positive, the distribution of ab coincides with that of $\sqrt{a}b\sqrt{a}$ which is self-adjoint, and we can identify $\mu _{ab}$ with a distribution on $\mathbbm {R}$. If $\mu _a$ and $\mu _b$ are compactly supported distributions, then so is $\mu _a \boxtimes \mu _b$.

As a concrete example of the freeness and the free product, let $X_n$ and $Y_n$ be two families of random $n \times n$ positive diagonal matrices with uniformly bounded norm, such that their spectrum converges weakly to probability distributions $\mu $ and $\nu $, respectively. Let $U_n$ be a family of Haar random unitary $n \times n$ matrices. Then, as n goes to infinity, $X_n$ and $Y_n' = U_nY_nU_n^\dagger $ will be freely independent (so they are asymptotically free), and we would like to study their product. The product of positive matrices need not be self-adjoint, so we consider $Z_n = \sqrt{X_n} Y_n' \sqrt{X_n}$ which is a positive matrix. One may then show that the distribution of the spectrum of $Z_n$ weakly converges in probability to $\mu \boxtimes \nu $. See Corollary 5.4.11 in [3] for a precise statement and proof.

The free product may be analyzed using generating functions: Given a (non-commutative) random variable a with distribution $\mu _a$, let

$$\begin{aligned} m_{a,k} = \int x^k \textrm{d}\mu _a(x) \end{aligned}$$

be the kth moment of $\mu _a$. Then, the moment-generating function is the formal power series

$$\begin{aligned} M_{\mu _a}(z) = \sum _{k=1}^\infty m_{a,k} z^k. \end{aligned}$$

(3.3)

We define the S-transform to be the formal power series

$$\begin{aligned} S_{\mu _a}(z) = \frac{1 + z}{z}M_{a}^{-1}(z), \end{aligned}$$

where $M_{a}^{-1}(z)$ is the power series corresponding to the formal inverse of $M_{\mu }(z)$ under composition, which is well defined as long as $m_{a,1} \ne 0$. For compactly supported distributions, the moment-generating function, and hence the S-transform, uniquely determines the distribution.

If a and b are non-commutative self-adjoint free random variables, then

$$\begin{aligned} S_{\mu _{a} \boxtimes \mu _{b}}(z) = S_{\mu _{ab}}(z) = S_{\mu _a}(z)S_{\mu _b}(z). \end{aligned}$$

(3.4)

This also provides a completely combinatorial interpretation of the free product, without reference to the associated non-commutative probability spaces. That is, given compactly supported distributions $\mu $ and $\nu $, we can define $\mu \boxtimes \nu $ by Eq. (3.4): It is the compactly supported distribution with moments prescribed by $S_{\mu _{a} \boxtimes \mu _{b}}(z)$, and hence, $M_{\mu _{a} \boxtimes \mu _{b}}(z)$. The free product is commutative and associative.

As an example, we compute the S-transform of the Marchenko–Pastur distribution $\mu \sim {{\,\textrm{MP}\,}}(1)$. The distribution is given by

$$\begin{aligned} \textrm{d}\mu (x) = \frac{1}{2\pi }\sqrt{4x^{-1} - 1} \textrm{d}x. \end{aligned}$$

The moments can be computed directly:

$$\begin{aligned} m_k = \sum _{i=0}^{k-1}\frac{1}{i+1}\left( {\begin{array}{c}k\\ i\end{array}}\right) \left( {\begin{array}{c}k-1\\ i\end{array}}\right) \end{aligned}$$

(3.5)

After some work, one can show that the moments above lead to a closed-form moment-generating function

$$\begin{aligned} M(z) = \frac{2z - 1 - \sqrt{1 - 4z}}{2z}. \end{aligned}$$

One may then invert the expression and obtain the S-transform

$$\begin{aligned} S(z) = \frac{1}{1+z}. \end{aligned}$$

Similarly, for the Marchenko–Pastur distribution MP(t) with parameter t, which has distribution as given in Eq. (2.17), we find that

$$\begin{aligned} S(z) = \frac{1}{t+z}. \end{aligned}$$

See, for instance, [12].

3.1.3 Non-crossing Partitions

Given $k \in \mathbbm {N}$, let NC(k) denote the set of non-crossing partitions of [k]. A non-crossing partition of [k] is a partition $[k] = X_1 \sqcup \ldots \sqcup X_m$ which is such that if $i < j \in X_\alpha $, then there are no $k, l \in X_{\beta }$ for $\beta \ne \alpha $ with $k< i< l < j$ or $i< k< j < l$. To any non-crossing partition, we associate a permutation $\pi \in S_k$ by mapping each subset $\{i_1, \ldots , i_l\}$ to the cycle $(i_1,\ldots , i_l)$ with $i_1< \cdots < i_l$. In a slight abuse of notation, we will write $\pi \in NC(k)$. For any $\pi \in S_k$, and for a sequence of numbers $f_k$ for $k = 1,2, \ldots $, we write

$$\begin{aligned} f_{\pi } = \prod _{l \in C(\pi )} f_l \end{aligned}$$

(3.6)

where $C(\pi )$ is the cycle type of $\pi $. We will need the following result, which is a straightforward consequence of the combinatorics of the S-transform.

Theorem 1.3

Consider compactly supported probability distributions $\mu , \nu , \eta $. Suppose that the moments of $\eta $ are given by

$$\begin{aligned} m_k^\eta = \sum _{\pi \in NC(k)} m^\mu _{\pi } m^\nu _{\pi ^{-1} \tau _k} \end{aligned}$$

where $\tau _k = (1 2 \ldots k)$ is the full cycle. Then,

$$\begin{aligned} \eta = {{\,\textrm{MP}\,}}(1) \boxtimes \mu \boxtimes \nu . \end{aligned}$$

Proof

We let ${\mathcal {F}}$ be the transformation that sends a formal power series f(z) to the power series $\frac{1}{z}f^{-1}(z)$. This is such that for some distribution $\mu $, the S-transform is given by $S_{\mu }(z) = (1 + z) {\mathcal {F}}(M_{\mu })(z)$. Moreover, given power series $f(z) = \sum _k f_k z^k$ and $g(z) = \sum _k g_k z^k$, define a convolution operation $\circledast $ by

$$\begin{aligned} (f \circledast g) (z) = \sum _k \left( \sum _{\pi \in NC(k)} f_{\pi } g_{\pi ^{-1} \tau _k}\right) z^k \end{aligned}$$

where $\tau _k$ is the full cycle in $S_k$. Then, Theorem 18.14 in [56] states that for any two f and g with $f_1 \ne 0$ and $g_1\ne 0$, it holds that

$$\begin{aligned} {\mathcal {F}}(f \circledast g)(z) = {\mathcal {F}}(f)(z){\mathcal {F}}(g)(z). \end{aligned}$$

Then, the S-transform of $\eta $ can be written:

$$\begin{aligned} S_\eta (z) = (1 + z) {\mathcal {F}}(M_\eta )(z) = (1 + z) {\mathcal {F}}(M_\mu )(z) {\mathcal {F}}(M_\nu )(z) = \frac{1}{1 + z} S_\mu (z)S_\nu (z). \end{aligned}$$

This implies the desired result, as the S-transform of ${{\,\textrm{MP}\,}}(1)$ is given by $\frac{1}{1+z}$, and the S-transform uniquely determines a compactly supported distribution. $\square $

We remark briefly that free independence can equivalently be formulated in terms of the vanishing of free cumulants, which are themselves defined in terms of sums over non-crossing partitions. We refer the interested reader to any of the previously cited references for a more in-depth discussion on the role of non-crossing partitions in free probability. For our purposes, the fact that non-crossing partitions are intimately related to free independence will allow us to later phrase random tensor network results in terms of free probability.

3.2 Entanglement Spectrum of Random Tensor Network States as a Free Product

We now return to studying random tensor network states. Consider a family of states in ${\mathcal {P}}_{\scriptscriptstyle {=}}(V)$ composed of the tensor product of link states $\vert \phi _e\rangle $ along the edges $e \in E$ as in Eq. (2.1), and assume that along each edge, the bond dimensions scale with a parameter D, so $D_e = \Theta (D)$. Our key assumption is that the link states have bounded spectral variation—by this we mean that the empirical distribution of the rescaled entanglement spectrum of the link states

$$\begin{aligned} \mu ^{(D)}_{e} := \sum _{i=1}^{D_e} \frac{1}{D_e} \delta _{D_e\lambda _{e,i}} \end{aligned}$$

(3.7)

has all moments converging to the moments $m_{e,k}$ of a compactly supported probability distribution $\mu _{e}$, as D goes to infinity. We assume that the link states are normalized, so $m_{e,1} = 1$. This condition implies that, up to a vanishing fraction as $D\rightarrow \infty $, the elements of the entanglement spectrum of the link state are of order $D^{-1}$.

For a minimal cut $\gamma _A$, let $\mu ^{(D)}_{\gamma _A}$ be the distribution for the spectrum of the tensor product of the link states in $\gamma _A$:

$$\begin{aligned} \mu ^{(D)}_{\gamma _A} = \bigotimes _{e \in \gamma _A} \mu ^{(D)}_{e} = \frac{1}{D_{\gamma _A}}\sum _{\{i_{e}\}} \delta _{D_{\gamma _A} \prod _{e \in \gamma _A}\lambda _{e,i_e}} \end{aligned}$$

where $i_e = 1,\ldots ,D_e$ and $D_{\gamma _A} = \prod _{e \in \gamma _A} D_e$. We define the tensor product of distributions as follows: If $X_1$ and $X_2$ are independent real valued random variables with distributions $\mu _{X_1}$ and $\mu _{X_2}$, then $\mu _{X_1} \otimes \mu _{X_2}$ is defined as the joint distribution of $(X_1, X_2)$. The distribution $\mu ^{(D)}_{\gamma _A}$ has kth moment given by $m^{(D)}_{\gamma _A,k} = \prod _{e \in \gamma _A} m^{(D)}_{e,k}$, and we can see that $m^{(D)}_{\gamma _A,k}$ converges to $m_{\gamma _A,k}$, the moments of the distribution

$$\begin{aligned} \mu _{\gamma _A} := \bigotimes _{e \in \gamma _A} \mu _{e}. \end{aligned}$$

Let ${{\,\textrm{spec}\,}}(\rho _A) = \{\lambda _{A,i}\}$. (Recall that ${{\,\textrm{spec}\,}}(\rho _A)$ is ordered in non-increasing order.) Let $\gamma _A$ be a cut for A. By a standard argument, the number of nonzero eigenvalues of $\rho _A$ (that is, ${{\,\textrm{rank}\,}}(\rho _A)$) is upper bounded by $D_{\gamma _A}$. If $\gamma _A$ is the unique minimal cut, then we define

$$\begin{aligned} \mu ^{(D)}_{A} := \frac{1}{D_{\gamma _A}} \sum _{i=1}^{D_{\gamma _A}} \delta _{D_{\gamma _A} \lambda _{A,i}}. \end{aligned}$$

(3.8)

If there are multiple minimal cuts, it is ambiguous which $\gamma _A$, and hence, which $D_{\gamma _A}$, we should pick; we choose the cut for which $D_{\gamma _A}$ is minimal in Eq. (3.8), and we will denote this minimal cut by $\gamma _{A,1}$. The moments of $\mu ^{(D)}_{A}$ are given by

$$\begin{aligned} m^{(D)}_{A,k} := \int z^k \textrm{d}\mu _{A}(z) = D_{\gamma _A}^{k-1} \sum _{i=1}^{D_{\gamma _A}} (\lambda _{A,i})^k. \end{aligned}$$

Note that the distribution $\mu ^{(D)}_A$ is random, and correspondingly, the moments $m^{(D)}_{A,k}$ are random variables. In contrast, the moments $m^{(D)}_{e,k}$ and $m^{(D)}_{\gamma _A,k}$ are numbers depending only on the bond dimension.

The theorem we want to prove will follow straightforwardly from a key intermediate result: As D goes to infinity, all the moments of the boundary distribution $\mu _A^{(D)}$ converge to the moments of $\mu _{\gamma _A}$. We use the notation in Eq. (3.6) to write expressions like

$$\begin{aligned} m_{\gamma _A,\pi } = \prod _{l \in C(\pi )} m_{\gamma _A,l} \end{aligned}$$

for a permutation $\pi \in S_k$. We will then apply the method of moments to show that convergence of moments implies convergence in distribution. As a remark on notation, in the error bounds in both the current section and Sect. 4, when we use ${\mathcal {O}}$-notation, the constants may depend on the graph underlying the tensor network. (Typically our bounds scale as $2^{\left|V_b\right|}$, where $V_b$ is the set of bulk vertices.)

Proposition 1.4

If there exists a unique minimal cut $\gamma _A$ for A, then

$$\begin{aligned} \lim _{D \rightarrow \infty } \mathbbm {E}m^{(D)}_{A,k} = m_{\gamma _A,k}. \end{aligned}$$

(3.9)

If there exist exactly two minimal cuts $\gamma _{A,1}$ and $\gamma _{A,2}$, which do not intersect (so $\gamma _{A,1} \cap \gamma _{A,2} = \emptyset $) and for which $\frac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}$ converges to a constant $t \le 1$, then

$$\begin{aligned} \lim _{D \rightarrow \infty } \mathbbm {E}m^{(D)}_{A,k} = \sum _{\pi \in NC(k)} t^{d(\pi ,\textrm{id})} m_{\gamma _{A,1},\tau ^{-1}\pi } m_{\gamma _{A,2},\pi }. \end{aligned}$$

(3.10)

Moreover, in both cases the variance goes to zero as D goes to infinity: for every k

$$\begin{aligned} \mathbbm {E}\left[\left( m^{(D)}_{A,k} - \mathbbm {E}\left[m^{(D)}_{A,k}\right]\right) ^2\right] = {\mathcal {O}}\left( \frac{1}{D}\right) . \end{aligned}$$

(3.11)

Proof

We first provide a sketch of the proof. It proceeds via the following steps:

1.
Write the expectation of the moments of $\mu _A^{(D)}$ as the partition function for a classical spin model, as in Sect. 2.1.
2.
Show that the contributions from terms of the form given in the statement of the proposition dominate the partition function by carefully tracking the powers of D, and showing that all other contributions are suppressed polynomially in D.
3.
Show that the variance of the moments vanishes in the limit $D \rightarrow \infty $ by direct computation.

We begin with Step 1. First, we observe that the kth moment of $\mu _A^{(D)}$ is given by $m^{(D)}_{A,k} = D_{\gamma _A}^{k-1} {{\,\textrm{tr}\,}}\left[\rho _A^k\right]$. Consider the expression in Eq. (2.14) for the replica trick with permutation $\pi $ on A:

$$\begin{aligned} Z_{k, \pi } := \mathbbm {E}{{\,\textrm{tr}\,}}\left[R_A(\pi ) \rho _A^{\otimes k}\right] = \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\pi }} \prod _{e = (xy) \in E} \prod _{l \in C(\pi _x^{-1} \pi _y)} {{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right]. \end{aligned}$$

(3.12)

Recall that the set ${\mathcal {S}}_{A,\pi }$, as defined in Eq. (2.11), consists of assignments of permutations to each $x \in V$, subject to $\pi _x = \pi $ for $x \in A$ and $\pi _x = \textrm{id}$ for $x \in {\bar{A}}$. As in Eq. (2.12), if $\pi = \tau $, then we indeed have $Z_{k, \tau } = \mathbbm {E}{{\,\textrm{tr}\,}}\left[\rho _A^k\right]$, so

$$\begin{aligned} \mathbbm {E}m^{(D)}_{A,k} = D_{\gamma _A}^{k-1}Z_{k, \tau }. \end{aligned}$$

(3.13)

On the other hand, if we let $k = 2n$ and $\pi = {\tilde{\tau }} = (1 2 \ldots n)(n+1 \, n+2 \ldots 2n)$, then $Z_{k, \pi } = \mathbbm {E}\left[{{\,\textrm{tr}\,}}\left[\rho _A^n\right]^2\right]$, and hence,

$$\begin{aligned} \mathbbm {E}\left[\left( m^{(D)}_{A,n}\right) ^2\right] = D_{\gamma _A}^{2n-2}Z_{k, {{\tilde{\tau }}}}. \end{aligned}$$

(3.14)

Recall that $m^{(D)}_{e,l} = D_e^{l-1}{{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right]$, and write

$$\begin{aligned} Z_{k, \pi } = \sum _{\{\pi _x\} \in {\mathcal {S}}_{A,\pi }} Z_k(\{\pi _x\}) \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} Z_{k}(\{\pi _x\})&:= \prod _{e = (xy) \in E} \prod _{l \in C(\pi _x^{-1} \pi _y)} {{\,\textrm{tr}\,}}\left[\phi _{e,x}^l\right]\\&= \prod _{e = (xy) \in E} D_e^{\left|C(\pi _x^{-1} \pi _y)\right| - k}\prod _{l \in C(\pi _x^{-1} \pi _y)} m^{(D)}_{e,l} \\&= \prod _{e = (xy) \in E} D_e^{-d(\pi _x,\pi _y)} m^{(D)}_{e,\pi _x^{-1} \pi _y}. \end{aligned} \end{aligned}$$

(3.15)

This accomplishes Step 1: We have recast the problem of computing moments into a question of computing a partition function for a classical spin model with fixed boundary conditions.

For Step 2, we want to show that the dominant contribution(s) to $Z_{k, \pi }$ as D goes to infinity are those given in the statement of the proposition. This will simply be a matter of checking powers of D, and using the triangle inequality property of the Cayley distance. If $\Gamma _{A}$ is the unique minimal cut, then we let $\pi ^{\min }_x = \pi $ for $x \in \Gamma _A$ and $\pi ^{\min }_x = \textrm{id}$ for $x \in V {\setminus } \Gamma _A$, and we have

$$\begin{aligned} Z_k(\{\pi ^{\min }_x\})&= D_{\gamma _A}^{-d(\pi ,\textrm{id})} \prod _{e \in \gamma _A} m^{(D)}_{e,\pi } = D_{\gamma _A}^{-d(\pi ,\textrm{id})} m^{(D)}_{\gamma _A,\pi }. \end{aligned}$$

(3.16)

If there are exactly two minimal cuts $\Gamma _{A,1} \subset \Gamma _{A,2}$, we let $V = V_1 \sqcup V_2 \sqcup V_3$, with $V_1 = \Gamma _{A,1}$, $V_2 = \Gamma _{A,2} \cap \Gamma _{A,1}^c$ and $V_3 = \Gamma _{A,2}^c$. Now, consider the permutations $\sigma \in S_k$ that are on a geodesic between $\pi $ and $\textrm{id}$ (recall this implies $d(\pi ,\sigma ) + d(\sigma , \textrm{id}) = d(\pi ,\textrm{id})$), and consider the configuration given by $\pi ^{\sigma }_x = \pi $ for $x \in V_1$, $\pi ^{\sigma }_x = \sigma $ for $x \in V_2$, and $\pi ^{\sigma }_x = \textrm{id}$ for $x \in V_3$. By hypothesis, $\gamma _{A,1}$ and $\gamma _{A,2}$ do not intersect, and hence, the edges in each cut are distinct. Then, this configuration has weight

$$\begin{aligned} Z_k(\{\pi ^{\sigma }_x\})&=\prod _{e_1 \in \gamma _{A,1}} D_{e_1}^{-d(\pi ,\sigma )} \prod _{l_1 \in C(\pi ^{-1}\sigma )} m^{(D)}_{e_1,l_1} \prod _{e_2 \in \gamma _{A,2}} D_{e_2}^{-d(\sigma ,\textrm{id})} \prod _{l_2 \in C(\sigma )} m^{(D)}_{e,l}\nonumber \\&= D_{\gamma _{A,1}}^{-d(\pi ,\textrm{id})} \left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma ,\textrm{id})} \prod _{e_1 \in \gamma _{A,1}} m^{(D)}_{e_1,\pi ^{-1}\sigma } \prod _{e_2 \in \gamma _{A,2}} m^{(D)}_{e_2,\sigma }\nonumber \\&= D_{\gamma _{A,1}}^{-d(\pi ,\textrm{id})} \left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma ,\textrm{id})} m^{(D)}_{\gamma _{A,1},\pi ^{-1}\sigma } m^{(D)}_{\gamma _{A,2},\sigma }, \end{aligned}$$

(3.17)

where $D_{\gamma _A,1}/D_{\gamma _A,2}$ converges to t, by assumption. Now, to show that these configurations yield the dominant contributions, we will need to use that $D_e = \Theta (D)$, so let us write $\frac{D}{D_e} = C_e^{(D)} = \Theta (1)$. Then for general configurations labeled by $\pi _x$, we may rewrite Eq. (3.15) as

$$\begin{aligned} Z_k(\{\pi _x\}) = \prod _{e = (xy) \in E} D^{-d(\pi _x,\pi _y)} (C^{(D)}_e)^{d(\pi _x,\pi _y)} m^{(D)}_{e,\pi _x^{-1} \pi _y}. \end{aligned}$$

The configurations we claimed to be dominant satisfy $Z_k(\{\pi \}) = \Theta (D^{-m(A)d(\pi ,\textrm{id})})$, where we recall that m(A) is the size of a minimal cut for A. Now, we will show that all other configurations satisfy $Z_k(\{\pi \}) = {\mathcal {O}}(D^{-m(A)d(\pi ,\textrm{id}) - 1})$. To this end, consider some arbitrary configuration $\{\pi _x\} \in {\mathcal {S}}_{A,\pi }$. Let P be a maximal set of edge-disjoint paths in G from A to ${\bar{A}}$. It is a well-known fact that such a set has size m(A), by the max-flow min-cut theorem. Let

$$\begin{aligned} C_k^{(D)} := \left( \max _{e \in E, l = 1, \ldots , k} (C^{(D)}_e)^{l-1}\right) \left( \max _{e \in E, \pi \in S_k} m^{(D)}_{e,\pi }\right) . \end{aligned}$$

Then, we may bound

$$\begin{aligned} Z_k(\{\pi _x\})&\le (C_k^{(D)})^{\left|E\right|} \prod _{e = (xy) \in E} D^{-d(\pi _x, \pi _y)} \le (C_k^{(D)})^{\left|E\right|} \prod _{p \in P} D^{- \sum _{e = (xy) \in p}d(\pi _x, \pi _y)}. \end{aligned}$$

(3.18)

The first inequality is clear from the definition of $C_k^{(D)}$, and in the second inequality, we simply restrict to a subset of the edges we multiply over. Note that $C_k^{(D)} = {\mathcal {O}}(1)$. Then, by the triangle inequality for the Cayley distance d, it holds that

$$\begin{aligned} \sum _{e = (xy) \in p}d(\pi _x, \pi _y) \ge d(\pi ,\textrm{id}) \end{aligned}$$

with equality if and only if the only edges (xy) for which $\pi _x \ne \pi _y$ are on a path in P, and each of the paths is a geodesic. Then, we conclude

$$\begin{aligned} Z_k(\{\pi _x\}) \le C_k^{(D)} \prod _{p \in P} D^{- d(\pi ,\textrm{id})} = C_k^{(D)} D^{- m(A)d(\pi ,\textrm{id})}, \end{aligned}$$

and we see that the weight of every configuration can be bounded by the product of a ${\mathcal {O}}(1)$ number and a polynomial in D.

Now, as promised, we show that if $\{\pi _x\}$ is not one of the minimal configurations described above, we actually have

$$\begin{aligned} Z_k(\{\pi _x\}) = {\mathcal {O}}(D^{- m(A)d(\pi ,\textrm{id}) - 1}). \end{aligned}$$

(3.19)

To see this, we rewrite the triangle inequality for the Cayley distance as:

$$\begin{aligned} \prod _{e = (xy) \in E} D^{-d(\pi _x, \pi _y)} \le \prod _{p \in P} D^{- \sum _{e = (xy) \in p}d(\pi _x, \pi _y)} \le D^{- m(A)d(\pi ,\textrm{id})} \end{aligned}$$

(3.20)

with equality if and only if the $\pi _x$ are on a geodesic path in P. We now show that this is satisfied only for the configurations we claimed to be minimal. Assume that $\{\pi _x\} \in {\mathcal {S}}_{A,\pi }$ is such that the inequalities in Eq. (3.20) are equalities and let

$$\begin{aligned} \Delta _n = \{x \in V \text { such that } d(\pi _x, \pi ) \le n \}. \end{aligned}$$

Then, $\Delta _n \in C(A)$ for $0 \le n < d(\pi ,\textrm{id})$, and we denote by $\delta _n$ the associated set of edges crossing the cut. Each edge $(xy) \in \delta _n$ must be such that $\pi _x \ne \pi _y$, so it must be on a path in P, and because the permutations are geodesics along the paths, they must be on different paths. Hence $\left|\delta _n\right| \le \left|P\right| = m(A)$, implying each $\Delta _n$ is a minimal cut. This immediately implies the claim if there is a unique minimal cut, since we must have $\Delta _{d(\pi ,\textrm{id}) - 1} = \Delta _{0} = \Gamma _A$. If there are exactly two minimal cuts, then we must have $\pi _x = \pi $ for $x \in V_1$, $\pi _x = \textrm{id}$ for $x \in V_3$, and there must be some l such that for all $x \in V_2$ we have $d(\pi ,\pi _x) = l$ and $d(\pi _x,\textrm{id}) = d(\pi ,\textrm{id}) - l$. Then in order to have equality in Eq. (3.20), we must have that for all $x \in V_2$, $\pi _x$ equals some fixed permutation $\sigma $, because the assumption of having exactly two cuts implies that $V_2$ is connected, and we must have $d(\pi _x, \pi _y) = 0$ for all $(xy) \in E$ with $x, y\in V_2$. This proves Eq. (3.19).

In conclusion, if there is a unique minimal cut, then by Eqs. (3.16) and (3.19), we find

$$\begin{aligned} Z_{k,\pi } = D_{\gamma _A}^{-d(\pi ,\textrm{id})} m^{(D)}_{\gamma _A,\pi } + {\mathcal {O}}(D^{-m(A)d(\pi ,\textrm{id}) - 1}), \end{aligned}$$

(3.21)

and if there are exactly two (non-intersecting) cuts, then by Eqs. (3.17) and (3.19), we find

$$\begin{aligned} Z_{k,\pi }&= \sum _{\sigma , \, d(\pi ,\sigma ) + d(\sigma ,\textrm{id}) = d(\pi ,\textrm{id})} D_{\gamma _{A,1}}^{-d(\pi ,\textrm{id})} \left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma ,\textrm{id})} m^{(D)}_{\gamma _{A,1},\pi ^{-1}\sigma } m^{(D)}_{\gamma _{A,2},\sigma } \nonumber \\&\quad + {\mathcal {O}}(D^{-m(A)d(\pi ,\textrm{id}) - 1}). \end{aligned}$$

(3.22)

Finally, we set $\pi = \tau $ for the full cycle $\tau $ and we use Eq. (3.13). For a unique minimal cut $\gamma _A$, by Eq. (3.21)

$$\begin{aligned} \begin{aligned} \mathbbm {E}m_{\gamma _A,k}&= D_{\gamma _A}^{k-1} Z_{k, \tau }\\&= m^{(D)}_{\gamma _A,\pi } + {\mathcal {O}}( D_{\gamma _A}^{k-1} D^{-m(A)(k-1) - 1})\\&= m^{(D)}_{\gamma _A,k} + {\mathcal {O}}\left( \frac{1}{D}\right) . \end{aligned} \end{aligned}$$

(3.23)

using $d(\tau ,\textrm{id}) = k-1$ and $D_{\gamma _A} = \Theta (D^{m(A)})$. This proves Eq. (3.9) as $m^{(D)}_{\gamma _A,k}$ converges to $m_{\gamma _A,k}$.

For two non-intersecting minimal cuts, we saw that the dominant contribution is due to configurations $\{\pi ^\sigma _x\}$ for $\sigma $ on a geodesic between $\tau $ and $\textrm{id}$. Then, applying the observation that $\sigma $ is on such a geodesic if and only if $\sigma $ is a non-crossing partition similarly yields that by Eq. (3.22)

$$\begin{aligned} \begin{aligned} \mathbbm {E}m_{\gamma _A,k}&= D_{\gamma _A}^{k-1} Z_{k, \tau }\\&= D_{\gamma _A}^{k-1}\sum _{\sigma \in NC(k)} \left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma ,\textrm{id})} m^{(D)}_{\gamma _{A,1},\pi ^{-1}\sigma } m^{(D)}_{\gamma _{A,2},\sigma } \\ {}&\quad + {\mathcal {O}}( D_{\gamma _A}^{k-1} D^{-m(A)(k-1) - 1})\\&= \sum _{\sigma \in NC(k)}\left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma ,\textrm{id})} m^{(D)}_{\gamma _{A,1},\tau ^{-1}\sigma } m^{(D)}_{\gamma _{A,2},\sigma } + {\mathcal {O}}\left( \frac{1}{D}\right) . \end{aligned} \end{aligned}$$

(3.24)

Since $m^{(D)}_{\gamma _{A,1},\tau ^{-1}\sigma } \rightarrow m_{\gamma _{A,1},\tau ^{-1}\sigma }$, $m^{(D)}_{\gamma _{A,2},\sigma } \rightarrow m_{\gamma _{A,2},\sigma }$ and $D_{\gamma _{A,1}}/ D_{\gamma _{A,2}} \rightarrow t$, this proves Eq. (3.10).

This accomplishes Step 2: We have shown that the configurations $\{\pi ^{\min }_x\}$ (in case of a unique minimal cut for A) and $\{\pi ^{\sigma }_x\}$ (in case there are exactly two non-intersecting minimal cuts for A) dominate in the computation of the expectation of the kth moment of $\mu _A$ in terms of powers of D.

We complete the proof by showing that the variance of $m_{A,k}$ vanishes as $D \rightarrow \infty $. We use the observation in Eq. (3.14), applying the analysis of $Z_{2k,\pi }$ to the case where $\pi = {\tilde{\tau }} = (1 2 \ldots k)(k+1 \, k+2 \ldots 2k)$. If there is a unique minimal cut, then using Eq. (3.21) and the fact that $d({\tilde{\tau }}, \textrm{id}) = 2k - 2$, we find

$$\begin{aligned} \mathbbm {E}\left[\left( m^{(D)}_{A,k}\right) ^2\right]&= D_{\gamma _A}^{2k-2} Z_{2k, {\tilde{\tau }}} = \prod _{l \in C({\tilde{\tau }})} m^{(D)}_{\gamma _A,l} + {\mathcal {O}}(D_{\gamma _A}^{2k-2} D^{-m(A)(2k-2) - 1}) \\&= (m^{(D)}_{\gamma _A,k})^2 + {\mathcal {O}}\left( \frac{1}{D}\right) . \end{aligned}$$

By Eq. (3.23), we know that $(\mathbbm {E}m^{(D)}_{A,k})^2 = (m^{(D)}_{\gamma _A,k} + {\mathcal {O}}(\frac{1}{D}))^2$, and we conclude that the variance obeys

$$\begin{aligned} \mathbbm {E}\left[\left( m^{(D)}_{A,k} - \mathbbm {E}\left[m^{(D)}_{A,k}\right]\right) ^2\right] = \mathbbm {E}\left[(m^{(D)}_{A,k})^2\right] - \left( \mathbbm {E}m^{(D)}_{A,k}\right) ^2 = {\mathcal {O}}\left( \frac{1}{D}\right) . \end{aligned}$$

For the case with exactly two minimal cuts, a similar argument holds. Here, the key observation is that $\sigma $ is on a geodesic between $\tilde{\tau }$ and $\textrm{id}$ if and only if $\sigma = \sigma _1 \sigma _2$ where $\sigma _1$ is on a geodesic between $(1 2 \ldots k)$ and $\textrm{id}$ and $\sigma _2$ is on a geodesic between $(k+1 \, k+2 \ldots 2k)$ and $\textrm{id}$. Using Eq. (3.22), this implies

$$\begin{aligned} \mathbbm {E}(m^{(D)}_{A,k})^2&= D_{\gamma _A}^{2k-2} Z_{2k, {{\tilde{\tau }}}}\\&=\sum _{\sigma _1,\sigma _2 \in NC(k)}\left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma _1,\textrm{id}) +d(\sigma _2,\textrm{id}) } m^{(D)}_{\gamma _{A,1},\tau ^{-1}\sigma _1}\\ {}&\quad m^{(D)}_{\gamma _{A,1},\tau ^{-1}\sigma _2} m^{(D)}_{\gamma _{A,2},\sigma _1} m^{(D)}_{\gamma _{A,2},\sigma _2} + {\mathcal {O}}\left( \frac{1}{D}\right) \\&= \left( \sum _{\sigma \in NC(k)}\left( \tfrac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}}\right) ^{d(\sigma ,\textrm{id})} m^{(D)}_{\gamma _{A,1},\tau ^{-1}\sigma } m^{(D)}_{\gamma _{A,2},\sigma } \right) ^2 + {\mathcal {O}}\left( \frac{1}{D}\right) . \end{aligned}$$

By Eq. (3.24), we see that this coincides with $(\mathbbm {E}m^{(D)}_{A,k})^2$, up to ${\mathcal {O}}(\tfrac{1}{D})$, and hence, Eq. (3.11) holds. $\square $

We now have the ingredients to prove that the entanglement spectrum of random tensor networks with link states with bounded spectral variation can be written in a simple fashion. We will use the method of moments to translate the above result on convergence of moments to convergence in distribution. The basic statement is that, given certain conditions on the distributions in question, if the moments of a sequence of distribution $\mu _n$ converge to those of $\mu $, then $\mu _n \Rightarrow \mu $—see, for instance, Theorem 30.8 in [13].

The method of moments is valid, so long as a distribution $\mu $ is completely determined by its moments. This occurs if, for all k, the kth moment $m_{\mu ,k}$ is bounded as

$$\begin{aligned} m_{\mu ,k} \le AB^kk! \end{aligned}$$

(3.25)

for constants A, B independent of k. If the distributions have compact support, as in Proposition 3.3, then this condition is satisfied.^{Footnote 2}

Now that we have established the convergence of moments in Proposition 3.3, we have our main result of the (conditional) convergence in distribution. As in Proposition 3.3, we consider a family of random tensor network states with link states with bounded spectral variation with increasing D, as defined in the beginning of this section.

Theorem 1.5

If there exists a unique minimal cut $\gamma _A$ for A, then $\mu _A^{(D)} \Rightarrow \mu _{\gamma _A}$, in probability, as $D \rightarrow \infty $. If there exist exactly two minimal cuts $\gamma _{A,1}$ and $\gamma _{A,2}$, which do not intersect and for which $\lim _{D \rightarrow \infty } \frac{D_{\gamma _{A,1}}}{D_{\gamma _{A,2}}} = t \le 1$, then $\mu ^{(D)}_A \Rightarrow {{\,\textrm{MP}\,}}(1) \boxtimes \mu _{\gamma _{A,1}} \boxtimes \mu _{\gamma _{A,2}}(t)$, in probability, where $\mu _{\gamma _{A,2}}(t) = (1 - t)\delta _0 + t\mu _{\gamma _{A,2}}$

Proof

It is straightforward to see that the kth moment of $\mu _{\gamma _{A,2}}(t)$ is given by $t^{k-1} m_{\gamma _{A,2},k}$, and then, the result follows immediately from Proposition 3.3, Theorem 3.2, and the method of moments. Because we assumed that for any minimal cut $\gamma _A$ for A, the limiting distributions $\mu _{\gamma _A}$ are compactly supported, they are uniquely determined by their moments. Hence, the method of moments is valid, and the convergence of moments implies convergence in distribution. $\square $

Remark 3.5

In Theorem 3.4, we assumed that the two cuts were non-intersecting. What happens if there are still only exactly two minimal cuts, but $\gamma _{A,1} \cap \gamma _{A,2}$ is non-empty? This extension is straightforward. Let $\gamma _A^{(a)}:= \gamma _{A,1} \cap \gamma _{A,2}$ and let $\gamma _{A,i}^{(b)} = \gamma _{A,i} {\setminus } \gamma _A^{(a)}$ for $i = 1,2$. In line with previous notation, let $\mu _{\gamma _A^{(a)}}$ and $\mu _{\gamma _{A,i}^{(b)}}$ denote the corresponding limiting distributions of the entanglement spectra along these sets, with moments $m_{\gamma _A^{(a)},k}$ and $m_{\gamma _{A,i}^{(b)},k}$. The only step in the proof of Proposition 3.3 where we used that the cuts were non-intersecting is when we computed the value of $Z_k(\{\pi _x\})$ for the optimal configuration. If the cuts do intersect, and we consider the configuration with $\pi _x = \tau $ for $x \in V_1$ with $\tau $ the complete cycle, $\pi _x = \sigma $ for $x \in V_2$ and $\sigma \in NC(k)$, and $\pi _x = \textrm{id}$ for $x \in V_3$, then a quick calculation shows

$$\begin{aligned} Z_{k}(\{\pi _x\})&\rightarrow D_{\gamma _{A,1}}^{-d(\pi ,\textrm{id})} (D_{\gamma _A,1}/D_{\gamma _A,2})^{d(\sigma ,\textrm{id})} \prod _{e \in \gamma _A^{(a)}} m_{e,k}\\ {}&\quad \prod _{e_1 \in \gamma ^{(b)}_{A,1}} m_{e_1,\pi ^{-1}\sigma } \prod _{e_2 \in \gamma ^{(b)}_{A,2}} m_{e_2,\sigma }. \end{aligned}$$

Apart from this modification, the proof of Proposition 3.3 is still valid, leading to

$$\begin{aligned} Z_{k,\tau } = m_{\gamma _A^{(a)},k} \sum _{\sigma \in NC(k)} t^{d(\sigma ,\textrm{id})}m_{\gamma _{A,1}^{(b)},\tau ^{-1}\sigma } m_{\gamma _{A,2}^{(b)},\sigma }. \end{aligned}$$

If, in Theorem 3.4, we do not assume that the cuts are non-intersecting, then the partition function above leads to a limiting distribution given by

$$\begin{aligned} \mu _{\gamma _A^{(a)}} \otimes \left( MP(1) \boxtimes \mu _{\gamma _{A,1}^{(b)}} \boxtimes \mu _{\gamma _{A,2}^{(b)}}(t)\right) . \end{aligned}$$

3.3 Non-trivial Link States and Entanglement Negativity

As another application of the theory of free probability, we will compute the entanglement negativity spectrum for random tensor network states with link states with bounded spectra. In [29], it was shown how to compute the entanglement negativity spectrum for a random tensor network state with maximally entangled link states using a replica trick. Using the methods from the previous subsection, we can analyze the negativity for entangled link states with bounded spectral variation. We remark that similar computations have recently been performed in [28] in the context of replica wormholes, and our assumption on the link states is a generalization of the “pairwise connected regime” in [28]. Another work investigating non-trivial entanglement negativity spectra in random tensor networks is [42], where they focus on the effect of having multiple minimal cuts in the network. As our analysis will be a straightforward combination of the arguments in [29] and Sect. 3.2, we will be rather concise; the main message of this section is to show that the language of free probability applies to other random tensor network computations as well.

We first recall how negativity functions as an entanglement measure for mixed states. Let ${\mathcal {T}}$ be the superoperator which maps an operator X to its transpose $X^{{\hspace{-0.83328pt}{\textsf{T}}}}$, and ${\mathcal {I}}$ be the identity superoperator. For $\rho _{AB} \in {\mathcal {P}}(AB)$,

$$\begin{aligned} \rho _{AB}^{T_B} := ({\mathcal {I}}_A \otimes {\mathcal {T}}_B)(\rho _{AB}) \end{aligned}$$

is the partial transpose of $\rho _{AB}$ on the B system. The logarithmic or entanglement negativity is given by

$$\begin{aligned} E_N(\rho _{AB}) = \log \frac{\Vert \rho _{AB}^{T_B}\Vert _1}{{{\,\textrm{tr}\,}}[\rho ]}. \end{aligned}$$

It is a measure for the entanglement of the mixed state $\rho _{AB}$: if $E_N(\rho _{AB}) > 0$ the state must be entangled. We call ${{\,\textrm{spec}\,}}(\left|\rho _{AB}^{T_B}\right|)$ the entanglement negativity spectrum. In analogy to the Rényi entropies, we can generalize the logarithmic negativity to a one-parameter family of negativities. The kth Rényi negativity is given by

$$\begin{aligned} N_k(\rho _{AB}) = {{\,\textrm{tr}\,}}\left[(\rho _{AB}^{T_B})^k\right]. \end{aligned}$$

If we let $N^{(\textrm{even})}_m(\rho _{AB}) = N_{2m}(\rho _{AB})$, then the logarithmic negativity is obtained as an analytic continuation in the Rényi index $m \rightarrow \frac{1}{2}$ of $\log (N^{(\textrm{even})}_m(\rho _{AB}))$. More precisely, in the expression

$$\begin{aligned} \log \sum _{\lambda \in {{\,\textrm{spec}\,}}(\rho _{AB}^{T_B})} \left|\lambda \right|^{\alpha }, \end{aligned}$$

we may take $\alpha \rightarrow \frac{1}{2}$ to obtain $E_N(\rho _{AB}) + \log {{\,\textrm{tr}\,}}[\rho ]$.

In the context of random tensor networks, we partition the boundary in three regions: $V_{\partial } = A \sqcup B \sqcup C$, and we would like to compute the Rényi negativities of the reduced state $\rho _{AB}$. We will then use this to determine the entanglement negativity spectrum and compute the entanglement negativity. The idea is that the kth Rényi negativity can be computed using a replica trick, by placing the full cycle $\tau _k = (1 2 \ldots k) \in S_k$ on A and $\tau _k^{-1} = (k \, k-1 \ldots 1)$ on B:

$$\begin{aligned} N_k(\rho _{AB})&= {{\,\textrm{tr}\,}}\left[\rho _{AB}^{\otimes k} \left( R_A(\tau ) \otimes R_B(\tau ^{-1})\right) \right] \\ {}&= {{\,\textrm{tr}\,}}\left[\rho _{ABC}^{\otimes k} \left( R_A(\tau ) \otimes R_B(\tau ^{-1}) \otimes R_C(\textrm{id})\right) \right]. \end{aligned}$$

Let us first discuss the case with maximally entangled link states, following [29]. The same arguments as in Sect. 2.2 show that one can compute the expectation of $N_k(\rho _{AB})$ for a random tensor network state using a spin model, now with boundary conditions of $\tau _k$ on A, $\tau _k^{-1}$ on B and $\textrm{id}$ on C. We will assume that the minimal cuts $\Gamma _A$, $\Gamma _B$ and $\Gamma _C$ are unique. Note that the minimal cut for AB is given by $\Gamma _{AB} = \Gamma _C^c$. From the theory of multi-commodity flows, it is known that there exist sets of edge-disjoint paths $P = P_{AB} \cup P_{AC} \cup P_{BC}$, where $P_{AB}$ consists of paths from A to B, and similarly for $P_{AC}$ and $P_{BC}$, and which are such that

$$\begin{aligned} \left|P_{AB}\right| + \left|P_{AC}\right| = \left|\gamma _A\right|, \qquad \left|P_{AB}\right| + \left|P_{BC}\right| = \left|\gamma _B\right|, \qquad \left|P_{AC}\right| + \left|P_{BC}\right| = \left|\gamma _C\right|. \end{aligned}$$

This can be used to show (in analogous fashion to the proof of Proposition 3.3) that if $k = 2n$ is even, any spin model configuration contributing to $\mathbbm {E}N_k(\rho _{AB})$ is of order ${\mathcal {O}}(D^{-(n-1)(\left|\gamma _A\right| + \left|\gamma _B\right|) - n\left|\gamma _C\right|})$. If $k = 2n+1$ is odd, any spin model configuration contributing to $\mathbbm {E}N_k(\rho _{AB})$ is of order ${\mathcal {O}}(D^{-n(\left|\gamma _A\right| + \left|\gamma _B\right| + \left|\gamma _C\right|)})$.

In order to determine what happens as $D \rightarrow \infty $, we need to determine the dominant configurations. Let r be the number of connected components of $V {\setminus } (\Gamma _A \cup \Gamma _B \cup \Gamma _C)$. There are two distinct cases. The first is when the minimal cut for AB (which is the complement of the minimal cut for C) is the union of the minimal cuts for A and B, so $\Gamma _{AB} = \Gamma _A \cup \Gamma _B$ and hence $\gamma _{AB} = \gamma _A \cup \gamma _B$. Then, the minimal cuts naturally partition the bulk vertices into three cuts $\Gamma _A$, $\Gamma _B$, and $\Gamma _C$, and we have $r=0$. In this case, the dominant configurations in the spin model are those where the vertices in $\Gamma _A$ are assigned $\tau _k$, those in $\Gamma _B$ are assigned $\tau _k^{-1}$ and those in $\Gamma _C$ are assigned $\textrm{id}$. This is illustrated in Fig. 4a.

The second case is when $\Gamma _A \cup \Gamma _B \subsetneq \Gamma _{AB}$ and hence $\gamma _{AB} \ne \gamma _A \cup \gamma _B$. Now, we have again the domains $\Gamma _A$, $\Gamma _B$ and $\Gamma _C$, but upon removing these vertices, there may also be connected components $V_1, \ldots , V_r$ which are not connected to A, B or C. Here, the minimal configurations are those for which, again, the vertices in $\Gamma _A$ are assigned $\tau _k$, those in $\Gamma _B$ are assigned $\tau _k^{-1}$ and those in $\Gamma _C$ are assigned $\textrm{id}$, and where in each component $V_i$ the vertices are assigned a permutation $\pi _i$ which is such that it satisfies three conditions: It must be on a geodesic between $\tau _k$ and $\tau _k^{-1}$, on a geodesic between $\tau _k$ and $\textrm{id}$ and on a geodesic between $\tau _k^{-1}$ and $\textrm{id}$. If $k = 2n$ is even, such permutations are given by non-crossing pairings: permutations corresponding to non-crossing partitions in which each cycle has length 2. The set of non-crossing pairings on 2n elements is in bijection with the set of non-crossing partitions on n elements, so the number of non-crossing pairings on 2n elements is given by $\left|NC(n)\right| = C_n$. One way to obtain this correspondence is as follows. If $\pi $ is a non-crossing pairing, $\tau _{2n} \pi $ will map even numbers to even numbers, and restricting to the even numbers and relabeling $2i \mapsto i$ yields a non-crossing partition $\sigma \in NC(n)$. Moreover, restricting to the odd numbers and relabeling $2i + 1 \mapsto i$ yields the non-crossing partition $\sigma ^{-1}\tau \in NC(n)$. This leads to $C_n^r$ dominant contributions to $\mathbbm {E}N_{2n}(\rho _{AB})$ of size $D^{-(n-1)(\left|\gamma _A\right| + \left|\gamma _B\right|) - n\left|\gamma _C\right|}$ since we can choose a non-crossing pairing $\pi _i$ for each component. Such a configuration is illustrated in Fig. 4b.

For odd $k = 2n + 1$, we similarly have permutations which correspond to a non-crossing partition, and which have a single fixed point and all other cycles with length 2. This leads to $((2n + 1)C_n)^r$ dominant contributions to $\mathbbm {E}N_{2n}(\rho _{AB})$, of size $D^{-(n-1)(\left|\gamma _A\right| + \left|\gamma _B\right|) - n\left|\gamma _C\right|}$. We also note that ${{\,\textrm{rank}\,}}(\rho _{AB}^{T_B}) \le D^{\left|\gamma _A\right| + \left|\gamma _B\right|}$. If ${{\,\textrm{spec}\,}}(\rho _{AB}^{T_B}) = \{s_i\}$, then we define the measure

$$\begin{aligned} \mu ^{(D)}_{AB} = \frac{1}{D^{\left|\gamma _A\right| + \left|\gamma _B\right|}} \sum _{i=1}^{D^{\left|\gamma _A\right| + \left|\gamma _B\right|}} \delta _{D^{\frac{1}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| + \left|\gamma _C\right|)}s_i}. \end{aligned}$$

(3.26)

This has moments given by

$$\begin{aligned} m^{(D)}_{AB,k} = \int x^k \textrm{d}\mu ^{(D)}_{AB}(x) = D^{(\tfrac{k}{2}-1)(\left|\gamma _A\right| + \left|\gamma _B\right|) + \tfrac{k}{2}\left|\gamma _C\right|} N_{k}(\rho _{AB}). \end{aligned}$$

If we take the expectation of the moments, we again need to distinguish the two cases. If $\left|\gamma _A\right| + \left|\gamma _B\right| = \left|\gamma _C\right|$, we see that the powers of D cancel for the dominant configurations, so $m^{(D)}_{AB,k} \rightarrow 1$ for all k. On the other hand, for $\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|$, we see that for $D \rightarrow \infty $ with odd k, we have $\mathbbm {E}m^{(D)}_{AB,k} \rightarrow 0$. For even k, we recover the degeneracy of the dominant configurations, leading to

$$\begin{aligned} \lim _{D \rightarrow \infty } m^{(D)}_{AB,k} = {\left\{ \begin{array}{ll} 1 &{} \text {if }\left|\gamma _A\right| + \left|\gamma _B\right| = \left|\gamma _C\right|,\\ 0 &{} \text {if }k\text { odd and }\left|\gamma _A\right| + \left|\gamma _B\right|> \left|\gamma _C\right|,\\ C_{k/2}^r &{} \text {if }k\text { even and }\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|, \end{array}\right. } \end{aligned}$$

(3.27)

where r is the number of connected components of $V {\setminus } (\Gamma _A \cup \Gamma _B \cup \Gamma _C)$. In fact, one can show that, as in Proposition 3.3, the variance goes to zero as well, and hence, the method of moments allows one to conclude that $\mu ^{(D)}_{AB} \Rightarrow \mu _{AB}$, in probability, where

$$\begin{aligned} \mu _{AB} = {\left\{ \begin{array}{ll} \sigma ^{\otimes r} &{} \text {if }r> 0,\\ \frac{1}{2}\delta _1 + \frac{1}{2}\delta _{-1} &{} \text {if }r=0\text { and }\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|,\\ \delta _1 &{} \text {if }r=0\text { and }\left|\gamma _A\right| + \left|\gamma _B\right| = \left|\gamma _C\right|, \end{array}\right. } \end{aligned}$$

(3.28)

where $\sigma $ is the semicircle distribution with density

$$\begin{aligned} \textrm{d}\sigma (x) = \frac{1}{2\pi }{\sqrt{4-x^2}} \mathbbm {1}_{\left|x\right|\le 2}\,dx \end{aligned}$$

Alternatively, one may study the empirical distribution of the squared entanglement negativity spectrum

$$\begin{aligned} \nu _{AB}^{(D)} = \frac{1}{D^{\left|\gamma _A\right| + \left|\gamma _B\right|}} \sum _i \delta _{D^{\left|\gamma _A\right| + \left|\gamma _B\right| + \left|\gamma _C\right|}s_i^2}. \end{aligned}$$

(3.29)

This distribution has kth moment given by $m^{(D)}_{AB,2k}$, and in comparison with the limiting moments in Eq. (3.27), one can conclude that $\nu _{AB}^{(D)} \Rightarrow \nu _{AB}$, in probability, where

$$\begin{aligned} \nu _{AB} = MP(1)^{\otimes r}. \end{aligned}$$

(3.30)

The logarithmic negativity can be computed using the distribution $\mu ^{(D)}_{AB}$ or $\nu ^{(D)}_{AB}$ as

$$\begin{aligned} E_N(\rho _{AB})&= \log \int \left|\lambda \right| \textrm{d}\mu ^{(D)}_{AB}(\lambda ) + \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|) - \log {{\,\textrm{tr}\,}}\left[\rho \right] \end{aligned}$$

(3.31)

$$\begin{aligned}&= \log \int \sqrt{\lambda } \textrm{d}\nu ^{(D)}_{AB}(\lambda ) + \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|) - \log {{\,\textrm{tr}\,}}\left[\rho \right]. \end{aligned}$$

(3.32)

The convergence of $\nu ^{(D)}_{AB}$ to $\nu _{AB}$ implies^{Footnote 3} that $E_N(\rho _{AB}) - \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|)$ converges in probability to

$$\begin{aligned} \log \int \sqrt{\lambda } \textrm{d}\nu _{AB}(\lambda ) = r\log \frac{8}{3\pi }. \end{aligned}$$

See Appendix D of [29] for details and proofs.

A straightforward combination of the arguments in Sect. 3.2 and [29] shows that the same configurations are the dominant contributions for link states with bounded spectral variation as in Sect. 3.2. To determine the limiting distribution in this case, we can generalize Eq. (3.28) in the same fashion as in Sect. 3.2. We assume the minimal cuts $\Gamma _A$, $\Gamma _B$ and $\Gamma _C$ are unique. We also assume that $\gamma _A \cap \gamma _B = \emptyset $, and in the case where $\gamma _C = \gamma _{AB} \ne \gamma _A \cup \gamma _B$ (so $\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|$), all pairwise intersections between $\gamma _A$, $\gamma _B$ and $\gamma _C$ are empty. This excludes the case where $\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|$, but $r = 0$. We let $\gamma _{A,i}$ and $\gamma _{B,i}$ denote the components of $\gamma _A$ and $\gamma _B$ which are connected to $V_i$, and we let $\mu ^{(D)}_{\gamma _{A,i}}$ and $\mu ^{(D)}_{\gamma _{B,i}}$ denote the distribution of the spectrum along these sets, with associated kth moments $m^{(D)}_{\gamma _{A,i},k}$, $m^{(D)}_{\gamma _{B,i},k}$, which we assume to converge to the moments $m_{\gamma _{A,i},k}$, $m_{\gamma _{B,i},k}$ of compactly supported distributions $\mu _{\gamma _{A,i}}$ and $\mu _{\gamma _{B,i}}$. For convenience, we assume $D_e = D$ for all edges $e \in E$.

We can now compute the dominant contributions to $\mathbbm {E}N_{k}(\rho _{AB})$. If $\gamma _C = \gamma _A \cup \gamma _B$, then there is a unique dominant configuration, which contributes $D^{-(k-1)\left|\gamma _C\right|} m_{\gamma _C,k}$. If $\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|$ and $k =2n$ is even, consider the configuration which assigns $\pi _i$ to $V_i$, where each $\pi _i$ is a non-crossing pairing. For each edge $e \in \gamma _C$, we have $m^{(D)}_{e,\pi _i} = (m^{(D)}_{e,2})^{n}$, so this configuration contributes

$$\begin{aligned} D^{-(n-1)(\left|\gamma _A\right| + \left|\gamma _B\right|) - n\left|\gamma _C\right|}\left( m^{(D)}_{\gamma _C,2}\right) ^{n} \prod _{i=1}^r\bigl (m^{(D)}_{\gamma _{A,i},\tau _{2n}^{-1} \pi _i} m^{(D)}_{\gamma _{B,i},\tau _{2n} \pi _i} \bigr ) . \end{aligned}$$

Recalling the construction of the equivalence between NC(n) and non-crossing pairings on 2n elements, we see that

$$\begin{aligned} m^{(D)}_{\gamma _{B,i},\tau _{2n} \pi _i}&= m^{(D)}_{\gamma _{B,i},\sigma _{i}} m^{(D)}_{\gamma _{B,i},\sigma _i^{-1}\tau _{n}} \end{aligned}$$

for some unique $\sigma _i \in NC(n)$. Similarly, one may verify

$$\begin{aligned} m^{(D)}_{\gamma _{A,i},\tau _{2n}^{-1} \pi _i}&= m^{(D)}_{\gamma _{A,i},\sigma _{i}} m^{(D)}_{\gamma _{A,i},\sigma _i^{-1}\tau _{n}}. \end{aligned}$$

This implies that the contribution of all dominant configurations is given by

$$\begin{aligned} \left( m^{(D)}_{\gamma _C,2}\right) ^{n}\prod _{i=1}^r\Bigg (\sum _{\sigma \in NC(n)} m^{(D)}_{\gamma _{A,i},\sigma } m^{(D)}_{\gamma _{B,i},\sigma } m^{(D)}_{\gamma _{A,i},\sigma ^{-1}\tau _n} m^{(D)}_{\gamma _{B,i},\sigma ^{-1}\tau _n}\Bigg ) \end{aligned}$$

As in the maximally entangled case, upon rescaling, the odd moments vanish as $D \rightarrow \infty $. In conclusion, the resulting asymptotic moments are given by

$$\begin{aligned} \lim _{D \rightarrow \infty } \mathbbm {E}m^{(D)}_{AB,k} = {\left\{ \begin{array}{ll} m_{\gamma _C,k} &{} \text {if }\left|\gamma _A\right| + \left|\gamma _B\right| = \left|\gamma _C\right|,\\ 0 &{} \text {if }k\text { odd and }\left|\gamma _A\right| + \left|\gamma _B\right|> \left|\gamma _C\right|,\\ m_k &{} \text {if }k\text { even and }\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right| \end{array}\right. } \end{aligned}$$

(3.33)

with

$$\begin{aligned} m_{2n} = m_{\gamma _C,2}^{n}\prod _{i=1}^r\Bigg (\sum _{\sigma \in NC(n)} m_{\gamma _{A,i},\sigma } m_{\gamma _{B,i},\sigma } m_{\gamma _{A,i},\sigma ^{-1}\tau _n} m_{\gamma _{B,i},\sigma ^{-1}\tau _n}\Bigg ). \end{aligned}$$

As before, one can also show, in similar fashion to the proof of Proposition 3.3, that the variance of the moments goes to zero as $D \rightarrow \infty $. For the case $\left|\gamma _A\right| + \left|\gamma _B\right| > \left|\gamma _C\right|$, we consider $\nu ^{(D)}_{AB}$ similar to Eq. (3.29), but with an additional rescaling by $m_{\gamma _C,2}$:

This has moments, which compute $N^{(\textrm{even})}_k(\rho _{AB})$, converging to

$$\begin{aligned} \lim _{D \rightarrow \infty } \mathbbm {E}\int x^k \textrm{d}\nu _{AB}^{(D)}(x) = \prod _{i=1}^r\Biggl (\sum _{\sigma \in NC(n)} m_{\gamma _{A,i},\sigma } m_{\gamma _{B,i},\sigma } m_{\gamma _{A,i},\sigma ^{-1}\tau _n} m_{\gamma _{B,i},\sigma ^{-1}\tau _n}\Biggr ). \end{aligned}$$

Thus, by the method of moments and Theorem 3.2, it holds that $\nu ^{(D)}_{AB} \Rightarrow \nu _{AB}$, in probability, where

$$\begin{aligned} \nu _{AB} = {\left\{ \begin{array}{ll} \bigotimes _{i = 1}^r \nu _i &{} \text {if }r > 0,\\ \mu _{\gamma _C} &{} \text {if }r=0, \end{array}\right. } \end{aligned}$$

(3.34)

and where $\nu _i$ is given by

$$\begin{aligned} \nu _i = (\mu _{\gamma _{A,i}} \otimes \mu _{\gamma _{B,i}})^{\boxtimes 2} \boxtimes {{\,\textrm{MP}\,}}(1). \end{aligned}$$

This reduces to Eq. (3.30) if the link states are maximally entangled. We can use this to compute the logarithmic negativity, as we did previously. For $r > 0$,

$$\begin{aligned} E_N(\rho _{AB})&= \log \int \sqrt{\lambda } \textrm{d}\nu _{AB}^{(D)}(\lambda ) + \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|) \\ {}&\quad + \frac{1}{2}\log m_{\gamma _{C,2}} - \log {{\,\textrm{tr}\,}}\left[\rho \right], \end{aligned}$$

from which we find that $E_N(\rho _{AB}) - \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|)$ converges in probability to

$$\begin{aligned} \log \int \sqrt{\lambda } \textrm{d}\nu _{AB}(\lambda ) + \frac{1}{2}\log m_{\gamma _{C,2}}. \end{aligned}$$

For the case $\left|\gamma _A\right| + \left|\gamma _B\right| = \left|\gamma _C\right|$, it is more elegant to use the limiting distribution of $\mu ^{(D)}_{AB}$, as defined in Eq. (3.26). By the method of moments and Eq. (3.33), $\mu ^{(D)}_{AB} \Rightarrow \mu _{\gamma _C}$, in probability. We may then compute the entanglement negativity as

$$\begin{aligned} E_N(\rho _{AB}) = \log \int \left|\lambda \right| \textrm{d}\mu _{AB}^{(D)}(\lambda ) + \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|) - \log {{\,\textrm{tr}\,}}\left[\rho \right], \end{aligned}$$

and hence $E_N(\rho _{AB}) - \frac{\log D}{2}(\left|\gamma _A\right| + \left|\gamma _B\right| - \left|\gamma _C\right|)$ converges in probability to $\log \int \left|\lambda \right| \textrm{d}\mu _{AB}(\lambda )$.

4 Link States with Unbounded Spectral Variation

We will now consider a different regime, where the link states have unbounded spectral variation. Our methods in this section are distinct from the previous one, and the two sections can be considered separately.

4.1 One-Shot Entropies

We begin by introducing one of our main tools for studying entanglement spectra in random tensor network states: one-shot entropies. In quantum information theory, the rates of certain important protocols, such as compression or state merging can be expressed as entropic quantities. One-shot entropies are the appropriate analogs for settings where one would like to analyze a task for a single or finite number of copies of the relevant state. Asymptotic rates in terms of ordinary von Neumann entropies are then recovered in the limit of infinitely many independent copies. For an extensive introduction to this point of view, see [74]; here, we provide the basic definitions and introduce the relevant concepts.

A random tensor network built from link states that are maximally entangled (or more generally have bounded spectral variation) can be analyzed using asymptotic tools. Indeed, if we have a maximally entangled state of large dimension $D=2^n$, then this is equal to the nth tensor power of a qubit maximally entangled state, so we are effectively in an asymptotic situation. However, if we allow for link states with unbounded spectral variation or even completely general background states, as in Sect. 2.3, then it is more natural to use tools from one-shot quantum information theory.

We take the Rényi entropies as a starting point, which we defined in Eq. (2.7) for subnormalized states. Let ${\mathcal {H}}$ be some Hilbert space and for $\rho \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ we define the (unconditional) min-entropy and the max-entropy by

$$\begin{aligned} H_{\min }(\rho )&= -\log \, \Vert \rho \Vert _{\infty }\\ H_{\max }(\rho )&= \log \left( {{\,\textrm{tr}\,}}[\sqrt{\rho }]^2\right) \end{aligned}$$

which coincide with the Rényi entropies $H_{\infty }(\rho )$ and $H_{\frac{1}{2}}(\rho )$ for $\rho \in {\mathcal {P}}_{\scriptscriptstyle {=}}({\mathcal {H}})$. As usual, if $\rho _A$ is the reduced density matrix on a system A, we write $H_{\min }(A)_{\rho } = H_{\min }(\rho _A)$ and $H_{\max }(A)_{\rho } = H_{\max }(\rho _A)$.

Often, when applied to study quantum information processing tasks, it is useful to allow a small error. This leads to the introduction of smooth entropies. To define these, we use a distance measure known as the purified distance, which is given for $\rho ,\sigma \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ by

$$\begin{aligned} P(\rho , \sigma ) = \sqrt{1 - F_*(\rho ,\sigma )^2} \end{aligned}$$

where $F_*(\rho ,\sigma )$ is the generalized fidelity between $\rho $ and $\sigma $, which is defined by

$$\begin{aligned} F_*(\rho ,\sigma ) = F(\rho ,\sigma ) + \sqrt{(1 - {{\,\textrm{tr}\,}}\left[\rho \right])(1 - {{\,\textrm{tr}\,}}\left[\sigma \right])} \end{aligned}$$

in terms of the ordinary fidelity $F(\rho ,\sigma ) = \Vert \sqrt{\rho }\sqrt{\sigma }\Vert _1$. We define the smooth min- and max-entropies of $\rho \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ as

$$\begin{aligned} H^\varepsilon _{\min }(\rho )&= \sup _{\rho ^{\varepsilon } \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}}), P(\rho ^\varepsilon ,\rho ) \le \varepsilon } H_{\min }(\rho ^\varepsilon )\\ H^\varepsilon _{\max }(\rho )&= \inf _{\rho ^{\varepsilon }\in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}}), P(\rho ^\varepsilon ,\rho ) \le \varepsilon } H_{\max }(\rho ^\varepsilon ). \end{aligned}$$

The smooth entropies are such that one recovers the usual von Neumann entropies in the limit of many independent copies. Indeed, the following asymptotic equipartition property holds:

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{1}{n} H^{\varepsilon }_{\min }(\rho ^{\otimes n}) = H(\rho ) = \lim _{n \rightarrow \infty } \frac{1}{n} H^{\varepsilon }_{\max }(\rho ^{\otimes n}) \end{aligned}$$

for any $0< \varepsilon < 1$. Variations on this definition are possible. For instance, one can choose a different distance measure, which will yield different entropies. However, for the usual choices, the differences go to zero as $\varepsilon $ goes to zero, so the particular choice is often immaterial. For instance, consider the trace distance between $\rho , \sigma \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$, which is defined by

$$\begin{aligned} T(\rho ,\sigma ) = \frac{1}{2}\Vert \rho - \sigma \Vert _1 + \frac{1}{2}\left|{{\,\textrm{tr}\,}}\left[\rho - \sigma \right]\right|, \end{aligned}$$

where the last term, which is absent in usual definitions of the trace distance, accounts for subnormalized states. It is easy to see that $T(\rho ,\sigma ) \le \Vert \rho - \sigma \Vert _1 \le 2T(\rho ,\sigma )$. The Fuchs–van de Graaff inequalities (see Lemma 3.17 in [74]) relate the trace distance and purified distance:

$$\begin{aligned} T(\rho ,\sigma ) \le P(\rho ,\sigma ) \le \sqrt{2T(\rho ,\sigma )} \end{aligned}$$

(4.1)

for $\rho , \sigma \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$.

There are also conditional versions of the Rényi entropies. Consider a bipartite quantum state $\rho _{AB} \in {\mathcal {P}}_{\scriptscriptstyle {=}}(AB)$. For the von Neumann entropy, the conditional entropy can simply be defined as an entropy difference, namely $H(A\vert B)_\rho = H(AB)_\rho - H(B)_\rho $. However, it turns out that this is not a good definition in the Rényi case. There are various ways to define a Rényi conditional entropy $H_k(A \vert B)$; we use a version based on the so-called sandwiched Rényi relative entropy. For $k=2$, this gives a quantum conditional collision entropy, which will be useful for defining minimal cuts and which is defined as follows. For $\rho _{AB} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(AB)$, let

$$\begin{aligned} H_2(A\vert B)_{\rho | \rho } := -\log {{\,\textrm{tr}\,}}\left[\left( (I \otimes \rho _B)^{-\frac{1}{4}}\rho _{AB} (I \otimes \rho _B)^{-\frac{1}{4}}\right) ^2\right] + \log {{\,\textrm{tr}\,}}\left[\rho \right]. \end{aligned}$$

(4.2)

Finally, there are also conditional versions of the min- and max-entropy. For $\rho _{AB} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(AB)$ and $\sigma _B \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(B)$, we define

$$\begin{aligned} H_{\min }(A\vert B)_{\rho |\sigma }&= - \inf \{ \lambda : \rho _{AB} \le 2^\lambda I_A \otimes \sigma _B\} \\ H_{\max }(A\vert B)_{\rho |\sigma }&= \log \, \Vert \sqrt{\rho _{AB}}\sqrt{I \otimes \sigma _B}\Vert _1^2 \end{aligned}$$

and we let

$$\begin{aligned} H_{\min }(A \vert B)_{\rho }&= \sup _{\sigma \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(B)} H_{\min }(A\vert B)_{\rho |\sigma }\\ H_{\max }(A \vert B)_{\rho }&= \sup _{\sigma \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(B)} H_{\max }(A\vert B)_{\rho |\sigma }. \end{aligned}$$

We can also define their smoothed versions

$$\begin{aligned} H^\varepsilon _{\min }(A\vert B)_\rho&= \sup _{\rho ^{\varepsilon } \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(AB), P(\rho ^\varepsilon ,\rho ) \le \varepsilon } H_{\min }(A\vert B)_{\rho ^\varepsilon }\\ H^\varepsilon _{\max }(A\vert B)_\rho&= \underset{\rho ^{\varepsilon } \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(AB), P(\rho ^\varepsilon ,\rho ) \le \varepsilon }{\inf } H_{\max }(A\vert B)_{\rho ^\varepsilon }. \end{aligned}$$

There is a duality between (smooth) max- and min-entropies. If $\rho \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(ABC)$ is a pure state, it holds that

$$\begin{aligned} H^{\varepsilon }_{\min }(A\vert B)_{\rho } = - H^{\varepsilon }_{\max }(A\vert C)_{\rho }. \end{aligned}$$

(4.3)

We will use the fact that for a normalized state $\rho _{AB} \in {\mathcal {P}}_{\scriptscriptstyle {=}}(AB)$ (Corollary 5.10 in [74])

$$\begin{aligned} H_{\min }(A\vert B)_{\rho } \le H_2(A\vert B)_{\rho |\rho }. \end{aligned}$$

(4.4)

A final important property of conditional smooth entropies is the data processing inequalities. Let $\Phi $ and $\Psi $ be completely positive and trace-preserving (CPTP) maps, mapping systems A to $A'$ and B to $B'$, respectively, and let $\sigma = (\Phi \otimes \Psi )(\rho )$. If $\Phi $ is also subunital, and $0 \le \varepsilon \le \sqrt{{{\,\textrm{tr}\,}}\rho }$, then Theorem 6.2 of [74] states

$$\begin{aligned} H_{\min }^{\varepsilon }(A' \vert B')_{\sigma } \ge H_{\min }^{\varepsilon }(A \vert B)_{\rho }\quad \text {and}\quad H_{\max }^{\varepsilon }(A' \vert B')_{\sigma } \ge H_{\max }^{\varepsilon }(A \vert B)_{\rho }. \end{aligned}$$

In fact, for the smooth min-entropy the data processing inequality is also valid if $\Phi $ is only trace non-increasing rather than trace-preserving, see [73].

4.2 Recovery Isometries

Recall that we study random tensor network states with link states, pure states placed on each edge whose tensor product forms the full state on edges $\phi = \bigotimes _{e \in E} \phi _e \in {\mathcal {P}}_{\scriptscriptstyle {=}}(V)$ for some graph $G = (V,E)$. In Sect. 2.3, we considered more general background states $\phi _V \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(V)$, where we no longer have a tensor product structure along the edges of some graph, and applying the replica trick does not yield a local spin model for the moments of the tensor network state. This situation is of independent interest, but will also be useful as an intermediate step when applying bounds based on one-shot entropies to link states. In Sect. 3, we studied link states for which the entanglement spectrum of the edge states $\phi _e$ had bounded variation, and we used the replica trick to compute the moments of the spectrum of $\rho _A$ for a boundary subsystem A. For general background states, we saw that the replica trick for $k=2$ extends as in Eq. (2.20). What are the minimal cuts in this setting? Based on Eq. (2.20), a first guess would be that $\Gamma _A \in C(A)$ would be a minimal cut (i.e., correspond to the dominant term in the replica trick) if for all other cuts $\Delta _A \in C(A)$ we would have $H_2(\Gamma _A)_{\phi } \ll H_2(\Delta _A)_\phi $. If the state is a link state, this corresponds to adding weights to the edges of the graph corresponding to the Rényi-2 entropies along the edges, and computing a weighted minimal cut. Indeed, this would yield an accurate approximation of ${{\,\textrm{tr}\,}}[\rho _A^2]$ and hence of $H_2(\rho _A)$. However, if the spectrum of $\rho _A$ is not close to a flat spectrum, this does not imply that ${{\,\textrm{spec}\,}}_+(\rho _A)$ is close to ${{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})$. We would like to show that for link states with unbounded spectral variation, and an appropriate minimal cut condition for $\Gamma _A \in C(A)$, it is still true that ${{\,\textrm{spec}\,}}_+(\rho _A)$ is close to ${{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})$.

We will adapt the $k=2$ replica trick for general background states to get a bound on the difference in trace norm between ${{\,\textrm{spec}\,}}_+(\rho _A)$ and ${{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})$ in terms of conditional Rényi-2 entropies,^{Footnote 4} as defined in Eq. (4.2). In Sect. 4.3, we will use this to formulate a condition for cut minimality in terms of smooth entropies for link states.

The main result of this subsection is a tensor network version of one-shot decoupling. Let $\phi _V \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(V)$. We allow $\phi _V$ to be a general state, which need not be pure and also need not be a product state along the edges of some graph. Let R be a purifying system and $\phi _{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$ be a purification of $\phi _V$. Then, we can construct the random tensor network state $\rho _{V_{\partial }R}$ where the boundary systems are given by $V_\partial \cup R$, which is a purification of the random tensor network state $\rho _{V_\partial }$ as in Eq. (2.18) by

$$\begin{aligned} \phi _{V_\partial R} = {{\,\textrm{tr}\,}}_{V_b}[(I_{V_\partial R} \otimes \psi ) \phi ] \end{aligned}$$

(4.5)

where $\psi $ is a tensor product of random tensors. We briefly recall our notation for boundary subsystems and cuts: For a boundary subsystem $A \subseteq V_\partial $, we denote its boundary complement by ${\bar{A}} = V_\partial {\setminus } A$, and for a cut $\Gamma _A \in C(A)$, we let $\Gamma _A^c = V {\setminus } \Gamma _A$, which is a cut for ${\bar{A}}$. The purifying system R can be thought of as an additional boundary system in the tensor network construction.

In Theorem 4.4, we will assume that we have a cut $\Gamma _A \in C(A)$ which is such that for all cuts $\Delta _A \in C(A)$ for which $\Delta _A \subsetneq \Gamma _A$ we have $H_2(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi \vert \phi } \gg 1$, and similarly for all cuts $\Delta _A \in C(A)$ for which $\Gamma _A \subsetneq \Delta _A$ we have $H_2(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A R)_{\phi \vert \phi } \gg 1$. We show that this condition implies that with high probability there exist isometries $V_{A}: {\mathcal {H}}_A \rightarrow {\mathcal {H}}_{\Gamma _A}$ and $V_{{\bar{A}}}: {\mathcal {H}}_{{\bar{A}}} \rightarrow {\mathcal {H}}_{\Gamma _A^c}$ such that

$$\begin{aligned} (V_A \otimes V_{{\bar{A}}} \otimes I_R)\vert \rho \rangle \approx \vert \phi \rangle . \end{aligned}$$

(4.6)

The approximation accuracy will be measured in trace norm. In particular, this implies that ${{\,\textrm{spec}\,}}_+(\rho _A) \approx {{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})$. If the state $\phi $ is a tensor product of link states, ${{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})$ is precisely the entanglement spectrum along the cut $\gamma _A$. The isometries $V_A$ and $V_{{\bar{A}}}$ are recovery isometries, which allow us to “recover” $\Gamma _A$ from the A system, and similarly we can recover $\Gamma _A^c$ from ${\bar{A}}$.

The result is closely related to quantum error correction. One way to interpret this is as follows: Consider a subspace ${\mathcal {H}}_S$ of ${\mathcal {H}}_{V}$ and let R be a reference system of dimension $\dim ({\mathcal {H}}_S)$, and $\phi _{VR}$ a maximally entangled state between S and R. Then, Eq. (4.6) can be interpreted as saying that if we encode the subspace S by projecting onto random tensors, the information in $\Gamma _A$ is protected, after encoding, against an erasure error on ${\bar{A}}$. This idea is also discussed in [61] for perfect tensor network models, and in [40] for random tensor networks with maximally entangled link states. In holography, the notion of local recovery isometries and their error correction interpretation goes under the name of entanglement wedge reconstruction or subregion-subregion duality. See [6, 7] for a detailed discussion of entanglement wedge reconstruction in holographic systems with bulk entropy, relating to one-shot entropies. We provide more details in “Appendix B.”

Our approach to showing Eq. (4.6) is that we start by projecting only on the random tensors in $\Gamma _A$, and not on the random tensors in $\Gamma _A^c$. This yields a random tensor network state $\sigma $ on $A\Gamma _A^c R$.

We then show that, by a version of one-shot decoupling, the reduced state on $\sigma _{\Gamma _A^c R}$ has not changed much from $\phi _{\Gamma _A^c R}$. By Uhlmann’s theorem, this implies that there exists an isometry $V_A$ such that $(V_A \otimes I_{\Gamma _A^c R})\vert \sigma \rangle \approx \vert \phi \rangle $. Combining this with a similar result for $\Gamma _A^c$ we obtain Eq. (4.6), as will be made precise in Theorem 4.4.

In our construction of $\sigma $, we can relabel the vertices in the graph, and think of the vertices in $\Gamma _A {\setminus } A$ as the bulk vertices $V_b$, the boundary subsystem A as the complete boundary $V_{\partial }$, and relabel all other subsystems as the reference system R. Then, we prove the following result, which is closely related to the one-shot decoupling results in [24].

Proposition 1.7

Consider a random tensor network state $\rho _{V_\partial R}$ as in Eq. (4.5) with a (purified) background state $\phi _{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$. Let $A = V_\partial $ and let $\Gamma _A = V$ and suppose that for any cut $\Delta _A \in C(A)$ other than $\Gamma _A$

$$\begin{aligned} H_2(\Gamma _A {\setminus } \Delta _A \vert R)_{\phi \vert \phi } \ge K\end{aligned}$$

then

$$\begin{aligned} \mathbbm {E}\Vert \rho _{R} - \phi _{R}\Vert _1 \le 2^{\tfrac{\left|V_b\right|}{2}}\sqrt{{{\,\textrm{tr}\,}}[\phi ]}2^{-\frac{1}{2} K}. \end{aligned}$$

Note that since $\Gamma _A = V_b \cup A$, the sets $\Gamma _A {\setminus } \Delta _A$ for $\Delta _A \in C(A) {\setminus } \{\Gamma _A\}$ are exactly the non-empty subsets of $V_b$. The formulation in terms of $\Delta _A \in C(A)$ will be natural when we apply this result in Theorem 4.4.

Proof

We closely follow the strategy in [24, 26]. We first note a basic fact (Lemma 3.7 in [24]): For any operator X and $\omega $ a subnormalized density matrix, it holds that

$$\begin{aligned} \Vert X\Vert _1 \le \Vert \omega ^{-\frac{1}{4}} X \omega ^{-\frac{1}{4}}\Vert _2. \end{aligned}$$

(4.7)

The proof is an application of the Cauchy–Schwarz inequality. We use Eq. (4.7) with $\omega = \phi _{R}$ and Jensen’s inequality to see that

$$\begin{aligned} \mathbbm {E}\Vert \rho _{R} - \phi _{R}\Vert _1 \le \sqrt{\mathbbm {E}{{\,\textrm{tr}\,}}[({\tilde{\rho }}_{R} - {\tilde{\phi }}_{R})^2]} \end{aligned}$$

where ${\tilde{\rho }}_{V_{\partial } R} = (I \otimes \phi _R)^{-\frac{1}{4}} \rho _{V_{\partial } R} (I \otimes \phi _R)^{-\frac{1}{4}}$ and ${\tilde{\phi }}_{VR} = (I \otimes \phi _R)^{-\frac{1}{4}} \phi _{VR} (I \otimes \phi _R)^{-\frac{1}{4}}$. Now, $\mathbbm {E}{\tilde{\rho }}_{R} = {\tilde{\phi }}_{R}$ by Eq. (2.21), and the replica trick in Eq. (2.20) yields

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}[({\tilde{\rho }}_{R} - {\tilde{\phi }}_{R})^2]&= \mathbbm {E}{{\,\textrm{tr}\,}}[{\tilde{\rho }}_{R}^2] - {{\,\textrm{tr}\,}}[{\tilde{\phi }}_{R}^2]\\&= \sum _{\Delta _A \in C(A), \Delta _A \subsetneq \Gamma _A} {{\,\textrm{tr}\,}}[{\tilde{\phi }}_{(\Gamma _A {\setminus } \Delta _A) R}^2]\\&= \sum _{\Delta _A \in C(A), \Delta _A \subsetneq \Gamma _A} {{\,\textrm{tr}\,}}[\phi ] 2^{-H_2(\Gamma _A {\setminus } \Delta _A \vert R)_{\phi |\phi }} \end{aligned}$$

using the definition of ${\tilde{\phi }}$ and Eq. (4.2) and hence

$$\begin{aligned} \left( \mathbbm {E}\Vert \rho _{R} - \phi _{R}\Vert _1\right) ^2 \le 2^{\left|V_b\right|}{{\,\textrm{tr}\,}}[\phi ] 2^{-K}. \end{aligned}$$

$\square $

Suppose that in the setup of Proposition 4.1, we would have equality $\rho _{R} = \phi _{R}$. Then, by Uhlmann’s theorem, their purifications $\rho _{AR}$ and $\phi _{\Gamma _A R}$ are related by an isometry $V_A$ from A to $\Gamma _A$. The following lemma is useful to extend to the case where the reduced states are close in trace distance.

Lemma 1.8

Suppose $\rho _{AB} \in {\mathcal {P}}(AB)$ and $\sigma _{AC} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(AC)$ are pure states on Hilbert spaces ${\mathcal {H}}_A \otimes {\mathcal {H}}_B$ and ${\mathcal {H}}_A \otimes {\mathcal {H}}_C$, respectively. Then,

$$\begin{aligned} \min _{V} \Vert (I_A \otimes V) \rho _{AB} (I_A \otimes V^\dagger ) - \sigma _{AC}\Vert _1 \le 2\sqrt{2\Vert \rho _{A} - \sigma _{A}\Vert _1 + 2\Vert \rho _{A} - \sigma _{A}\Vert _1^2}. \end{aligned}$$

where the minimum is over all isometries $V: {\mathcal {H}}_B \rightarrow {\mathcal {H}}_C$.

Proof

Uhlmann’s theorem states that if $\rho _{AB} \in {\mathcal {P}}(AB)$ and $\sigma _{AC} \in {\mathcal {P}}(AC)$ are pure quantum states with $\dim ({\mathcal {H}}_B) \le \dim ({\mathcal {H}}_C)$, then there exists an isometry $V: {\mathcal {H}}_B \rightarrow {\mathcal {H}}_C$ such that

$$\begin{aligned} P(\rho _A,\sigma _A) = P((I_A \otimes V) \rho _{AB} (I_A \otimes V^\dagger ), \sigma _{AC}) \end{aligned}$$

and, in particular, the isometry is the solution to an optimization problem:

$$\begin{aligned} P(\rho _A, \sigma _A) = \min _V P((I_A \otimes V) \rho _{AB} (I_A \otimes V^\dagger ), \sigma _{AC}). \end{aligned}$$

Moreover, if both $\rho $ and $\sigma $ are subnormalized, by Eq. (4.1), we can bound

$$\begin{aligned} \min _{V} \Vert (I_A \otimes V) \rho _{AB} (I_A \otimes V^\dagger ) - \sigma _{AC}\Vert _1&\le \min _{V} 2P((I_A \otimes V) \rho _{AB} (I_A \otimes V^\dagger ), \sigma _{AC})\\&= 2P(\rho _A,\sigma _A) \\&\le 2\sqrt{2\Vert \rho _{A} - \sigma _{A}\Vert _1}. \end{aligned}$$

From this, it follows that if $\sigma $ is subnormalized and $\rho $ has ${{\,\textrm{tr}\,}}[\rho ] > 1$,

$$\begin{aligned} \min _{V} \Vert (I_A \otimes V) \rho _{AB} (I_A \otimes V^\dagger ) - \sigma _{AC}\Vert _1 \le 2\sqrt{2{{\,\textrm{tr}\,}}[\rho ]\Vert \rho _{A} - \sigma _{A}\Vert _1}. \end{aligned}$$

Since ${{\,\textrm{tr}\,}}[\rho ] \le {{\,\textrm{tr}\,}}[\sigma ] + \Vert \rho - \sigma \Vert _1$ and ${{\,\textrm{tr}\,}}[\sigma ] \le 1$, we conclude that

for arbitrary $\rho $ and subnormalized $\sigma $. $\square $

Finally, we will need a basic lemma relating tensor network states with differing background states:

Lemma 1.9

Suppose we consider random tensor network states $\rho _{V_\partial R}$ and ${\tilde{\rho }}_{V_\partial R}$ with (purified) background states $\phi _{V R}, {\tilde{\phi }}_{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$ and projecting onto the same random tensors. Then,

$$\begin{aligned} \mathbbm {E}\Vert \rho _{V_\partial R} - {\tilde{\rho }}_{V_\partial R} \Vert _1 \le \Vert \phi _{VR} - \tilde{\phi }_{VR}\Vert _1. \end{aligned}$$

Proof

Let $\phi - {\tilde{\phi }} = \Delta _+ - \Delta _-$ where both $\Delta _+$ and $\Delta _-$ are positive semidefinite and are such that $\Vert \phi - {\tilde{\phi }}\Vert _1 = {{\,\textrm{tr}\,}}[\Delta _+ + \Delta _-]$. Then, we can also consider the random tensor network states $\sigma _+$ and $\sigma _-$ which take $\Delta _+$ and $\Delta _-$ as background states, and by the linearity of Eq. (4.5) in the background state we have $\rho - {{\tilde{\rho }}} = \sigma _+ - \sigma _-$. By Eq. (2.21), $\mathbbm {E}\sigma _{\pm } = \Delta _{\pm }$. We then estimate

$$\begin{aligned} \mathbbm {E}\Vert \rho - {{\tilde{\rho }}}\Vert _1&= \mathbbm {E}\Vert \sigma _+ - \sigma _-\Vert _1 \le \mathbbm {E}(\Vert \sigma _+\Vert _1 + \Vert \sigma _-\Vert _1) = \mathbbm {E}{{\,\textrm{tr}\,}}[\sigma _+ + \sigma _-] \\&= {{\,\textrm{tr}\,}}[\Delta _+ + \Delta _-] = \Vert \phi - {\tilde{\phi }}\Vert _1. \end{aligned}$$

where we have used that $\sigma _+$ and $\sigma _-$ are positive semidefinite and hence $\Vert \sigma _{\pm }\Vert _1 = {{\,\textrm{tr}\,}}[\sigma _{\pm }]$. $\square $

With all our tools assembled, we are ready to prove the main result of this subsection. We again let $\phi _{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$ be a background state with R a purifying system, and we let $\rho _{V_{\partial }R}$ be the associated random tensor network state as constructed in Eq. (4.5). Let $\Gamma _A$ be an arbitrary cut for the boundary region A. In Theorem 4.4, we provide a criterion to determine whether $\Gamma _A$ is a minimal cut in terms of conditional entropies. Informally speaking, the following result shows that if $\Gamma _A$ is a minimal cut in this sense, we can recover the system $\Gamma _A$ from the boundary subsystem A, while we can recover $\Gamma _A^c$ from the boundary subsystem ${\bar{A}}$. For general $\phi $, Theorem 4.4 is closely related to the task of split transfer, see “Appendix B” for a discussion. The following result closely follows Proposition 18 of [26].

Theorem 1.10

(Recovery isometries). Let $\phi _{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$, and let $\rho _{V_{\partial }R}$ be the associated random tensor network state as in Eq. (4.5). Let $\Gamma _A \in C(A)$ and suppose that

$$\begin{aligned} H_2(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi |\phi } \ge K_1 \end{aligned}$$

(4.8)

for all cuts $\Delta _A \in C(A)$ such that $\Delta _A \subsetneq \Gamma _A$ and

$$\begin{aligned} H_2(\Delta _A{\setminus } \Gamma _A \vert \Gamma _A R)_{\phi |\phi } \ge K_2 \end{aligned}$$

(4.9)

for all cuts $\Delta _A \in C(A)$ such that $\Gamma _A \subsetneq \Delta _A$. Then,

$$\begin{aligned}&\mathbbm {E}\min _{V_A, V_{{\bar{A}}}} \Vert (V_A \otimes V_{{\bar{A}}} \otimes I_R)\rho _{V_{\partial }R}(V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \nonumber \\ {}&\qquad \qquad = {\mathcal {O}}({{\,\textrm{tr}\,}}[\phi ]^{\frac{1}{4}}(2^{-\frac{1}{4}K_1} + 2^{-\frac{1}{4}K_2})). \end{aligned}$$

(4.10)

where the minimum is over isometries $V_{A}: {\mathcal {H}}_A \rightarrow {\mathcal {H}}_{\Gamma _A}$ and $V_{{\bar{A}}}: {\mathcal {H}}_{{\bar{A}}} \rightarrow {\mathcal {H}}_{\Gamma _A^c}$.

Proof

Let $\sigma _{A\Gamma _A^c R}$ be the state where we have contracted along the tensors in $\Gamma _A$ but not along those in $\Gamma _A^c$, and similarly let $\tau _{{\bar{A}} \Gamma _A R}$ be the state where we have contracted along the tensors in $\Gamma _A^c$ but not along those in $\Gamma _A$. We first use Proposition 4.1 to show that $\sigma _{\Gamma _A^c R} \approx \phi _{\Gamma _A^c R}$ and $\tau _{\Gamma _A R} \approx \phi _{\Gamma _A R}$. Indeed, for $\sigma $ we simply apply Proposition 4.1 with $V_b \cap \Gamma _A$ as the set of bulk vertices, A as the set of boundary vertices and $\Gamma _A^c R$ as the reference system. This gives

$$\begin{aligned} \mathbbm {E}\Vert \sigma _{\Gamma _A^c R} - \phi _{\Gamma _A^c R}\Vert _1 ={\mathcal {O}}(\sqrt{{{\,\textrm{tr}\,}}[\phi ]}2^{-\frac{1}{2} K_1}). \end{aligned}$$

A similar application of Proposition 4.1, with $V_b \cap \Gamma _A^c$ as the set of bulk vertices, ${\bar{A}}$ as the set of boundary vertices and $\Gamma _A R$ as the reference system, shows

$$\begin{aligned} \mathbbm {E}\Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1 = {\mathcal {O}}\bigg (\sqrt{{{\,\textrm{tr}\,}}[\phi ]}2^{-\frac{1}{2} K_2}\bigg ). \end{aligned}$$

We note that for any isometries $V_{A}: {\mathcal {H}}_A \rightarrow {\mathcal {H}}_{\Gamma _A}$ and $V_{{\bar{A}}}: {\mathcal {H}}_{{\bar{A}}} \rightarrow {\mathcal {H}}_{\Gamma _A^c}$

$$\begin{aligned}&\Vert (V_A \otimes V_{{\bar{A}}} \otimes I_R)\rho (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \\&\quad \le \Vert (I_{\Gamma _A} \otimes V_{{\bar{A}}} \otimes I_R)\tau (I_{\Gamma _A} \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \\&\qquad + \Vert \tau - (V_A \otimes I_{{\bar{A}} R})\rho (V_A^\dagger \otimes I_{{\bar{A}} R})\Vert _1, \end{aligned}$$

where we have applied the triangle inequality after adding and subtracting $(I_{\Gamma _A} \otimes V_{{\bar{A}}} \otimes I_R)\tau (I_{\Gamma _A} \otimes V_{{\bar{A}}}^\dagger \otimes I_R)$, and then using the invariance of the trace norm under isometries in the second term. We use this to estimate

$$\begin{aligned} \begin{aligned}&\mathbbm {E}\min _{V_A, V_{{\bar{A}}}} \Vert (V_A \otimes V_{{\bar{A}}} \otimes I_R)\rho (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \\&\quad \le \mathbbm {E}\bigl ( \min _{V_{{\bar{A}}}} \Vert (I_{\Gamma _A} \otimes V_{{\bar{A}}} \otimes I_R)\tau (I_{\Gamma _A} \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \\&\qquad + \min _{V_A} \Vert \tau - (V_A \otimes I_{{\bar{A}} R})\rho (V_A^\dagger \otimes I_{{\bar{A}} R})\Vert _1 \bigr ), \end{aligned}\nonumber \\ \end{aligned}$$

(4.11)

where the minimum is over isometries $V_{A}: {\mathcal {H}}_A \rightarrow {\mathcal {H}}_{\Gamma _A}$ and $V_{{\bar{A}}}: {\mathcal {H}}_{{\bar{A}}} \rightarrow {\mathcal {H}}_{\Gamma _A^c}$. For the first term of Eq. (4.11), we apply Lemma 4.2 to get

$$\begin{aligned}&\min _{V_{{\bar{A}}}} \Vert (I_{\Gamma _A} \otimes V_{{\bar{A}}} \otimes I_R)\tau (I_{\Gamma _A} \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1\\&\quad \le 2\sqrt{2 \Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1 + 2\Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1^2}\\&\quad \le 2\sqrt{2} \left( \sqrt{\Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1} + \Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1\right) \end{aligned}$$

and by Jensen’s inequality

$$\begin{aligned}&\mathbbm {E}\min _{V_{{\bar{A}}}} \Vert (I_{\Gamma _A} \otimes V_{{\bar{A}}} \otimes I_R)\tau (I_{\Gamma _A} \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \nonumber \\&\quad \le 2\sqrt{2} \left( \sqrt{\mathbbm {E}\Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1} + \mathbbm {E}\Vert \tau _{\Gamma _A R} - \phi _{\Gamma _{A R}}\Vert _1\right) \nonumber \\&\quad = {\mathcal {O}}({{\,\textrm{tr}\,}}[\phi ]^{\frac{1}{4}}2^{-\frac{1}{4} K_2}) \end{aligned}$$

(4.12)

For the second term of Eq. (4.11), we can think of $\tau $ and $(V_A \otimes I_{{\bar{A}} R})\rho (V_A^\dagger \otimes I_{{\bar{A}} R})$ as the random tensor network states with $\phi $ and $(V_A \otimes I_{\Gamma _A^c R})\sigma (V_A^\dagger \otimes I_{\Gamma _A^c R})$ as the full state on edges, applying random tensors in $\Gamma _A^c$. Then, denoting by $\mathbbm {E}_{\Gamma _A^c}$ the expectation value over all random tensors in $\Gamma _A^c$, by Lemma 4.3

$$\begin{aligned} \mathbbm {E}_{\Gamma _A^c} \Vert \tau - (V_A \otimes I_{{\bar{A}} R})\rho (V_A^\dagger \otimes I_{{\bar{A}} R})\Vert _1 = {\mathcal {O}}(\Vert \phi _{VR} - (V_A \otimes I_{{\bar{A}} R})\sigma (V_A^\dagger \otimes I_{{\bar{A}} R})\Vert _1). \end{aligned}$$

We thus estimate

$$\begin{aligned} \mathbbm {E}\min _{V_A}&\Vert \tau - (V_A \otimes I_{{\bar{A}} R})\rho (V_A^\dagger \otimes I_{{\bar{A}} R})\Vert _1 \\ {}&= {\mathcal {O}}(\mathbbm {E}_{\Gamma _A} \min _{V_A} \Vert \phi _{VR} - (V_A \otimes I_{{\bar{A}} R})\sigma (V_A^\dagger \otimes I_{{\bar{A}} R})\Vert _1) \end{aligned}$$

for which we may argue exactly as in Eq. (4.12) and using Lemma 4.2 that

$$\begin{aligned} \mathbbm {E}_{\Gamma _A} \min _{V_A} \Vert \phi _{VR} - (V_A \otimes I_{\Gamma _A^c R})\sigma (V_A^\dagger \otimes I_{\Gamma _A^c R})\Vert _1 = {\mathcal {O}}({{\,\textrm{tr}\,}}[\phi ]^{\frac{1}{4}}2^{-\frac{1}{4} K_1}). \end{aligned}$$

We conclude that

$$\begin{aligned}&\mathbbm {E}\min _{V_A, V_{{\bar{A}}}} \Vert (V_A \otimes V_{{\bar{A}}} \otimes I_R)\rho (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi _{VR}\Vert _1 \\ {}&\qquad \qquad = {\mathcal {O}}({{\,\textrm{tr}\,}}[\phi ]^{\frac{1}{4}}(2^{-\frac{1}{4} K_1} + 2^{-\frac{1}{4} K_2})). \end{aligned}$$

$\square $

We hence find that the closeness of the boundary and background state can be bounded via conditional Rényi-2 entropies of cuts. In particular, for large $K_1,K_2$, the recovery isometries can recover states to good accuracy, and we find that $\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})\Vert $ is small. However, this result is not yet completely satisfying. The conditional Rényi-2 entropy is not a “robust” quantity, in the sense that a small deformation of $\phi $ can drastically change the values of the conditional Rényi-2 entropies in Eqs. (4.8) and (4.9). For this reason, we would like a condition with smoothed entropies. We first note that one can actually show that for the condition in Eq. (4.8), we can bound

$$\begin{aligned} H_2(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi \vert \phi } \ge H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi } \end{aligned}$$

(4.13)

and similarly for Eq. (4.9),

$$\begin{aligned} H_2(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A R)_{\phi \vert \phi } \ge H_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A R)_{\phi } \end{aligned}$$

(4.14)

To make the condition “robust,” we would like to replace these by smoothed entropies and express a condition in terms of $H^{\varepsilon }_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi }$ and $H^{\varepsilon }_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A R)_{\phi }$. This will require simultaneous smoothing: finding a state $\phi ^\varepsilon _{VR} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(VR)$ which is close to $\phi $, such that $H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi ^{\varepsilon }} \ge H^{\varepsilon }_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R)_{\phi }$ and $H_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A R)_{\phi ^{\varepsilon }} \ge H^{\varepsilon }_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A R)_{\phi }$ for all relevant cuts $\Delta _A$. If we have a general background state, it is not known how this can be done [25, 31]. However, if the background state is actually a tensor product of link states, we can perform the simultaneous smoothing.

4.3 One Minimal Cut

The primary result in this subsection is Theorem 4.7, which states that if the background state is actually a tensor product of link states as in Eq. (2.2), then the spectrum of the boundary state is well approximated by the spectrum of the minimal cut link state in expectation, where the approximation accuracy is controlled by smooth one-shot entropies. It is a straightforward application of Theorem 4.4.

For a cut $\Gamma _A \in C(A)$ we define

$$\begin{aligned} {\mathcal {C}}_1(\Gamma _A)&= \{\Delta _A \in C(A) : \Delta _A \subsetneq \Gamma _A \} \\ {\mathcal {C}}_2(\Gamma _A)&= \{\Delta _A \in C(A) : \Gamma _A \subsetneq \Delta _A \}. \end{aligned}$$

The key result we need is the following lemma, which we prove in “Appendix C,” which shows that if we have a link state, we can perform the desired joint smoothing.

Lemma 1.11

Let $\phi \in {\mathcal {P}}_{\scriptscriptstyle {=}}(V)$ be a link state, $A \subseteq V_{\partial }$ a boundary subsystem and $\Gamma _A \in C(A)$ a cut for A. Then, there exists a pure state $\phi ^{\varepsilon } \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(V)$ which is such that

$$\begin{aligned} P(\phi ,\phi ^\varepsilon ) \le 2\left( \sqrt{\left|{\mathcal {C}}_1(\Gamma _A)\right|} + \sqrt{\left|{\mathcal {C}}_2(\Gamma _A)\right|}\right) \sqrt{\varepsilon } \end{aligned}$$

and it holds that for any $\Delta _A \in {\mathcal {C}}_1(\Gamma _A)$

$$\begin{aligned} H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^\varepsilon } \ge H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi } \end{aligned}$$

and for any $\Delta _A \in {\mathcal {C}}_2(\Gamma _A)$

$$\begin{aligned} H_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi ^\varepsilon } \ge H_{\min }^{\varepsilon }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi }. \end{aligned}$$

We now define the notion of a minimal cut for an arbitrary link state.

Definition 1.12

(Generalized minimal cut) A cut $\Gamma _{A}$ is an $(\varepsilon ,K)$-minimal cut if for all $\Delta _A \in {\mathcal {C}}_1(\Gamma _A)$

$$\begin{aligned} H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi } \ge K \end{aligned}$$

and for all $\Delta _A \in {\mathcal {C}}_2(\Gamma _A)$

$$\begin{aligned} H_{\min }^{\varepsilon }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi } \ge K. \end{aligned}$$

This definition is consistent with the smooth entropy conditions for minimal surfaces in holography from [6]. The following is now a straightforward consequence of Theorem 4.4 and Lemma 4.5. It justifies our notion of a generalized minimal cut, as it controls the degree to which the spectrum of the corresponding cut link state $\phi _{\Gamma _A}$ is close to the boundary state $\rho _A$.

Theorem 1.13

Consider a random tensor network state $\rho $ constructed with $\phi \in {\mathcal {P}}_{\scriptscriptstyle {=}}(V)$ a tensor product of link states as in Eq. (2.2). Let A be a boundary region of the network, $\rho _A$ the corresponding boundary state, and $\Gamma _A$ an $(\varepsilon ,K)$-minimal cut. Then, the spectra of $\rho _A$ and the state $\phi _{\Gamma _A}$ on A are related as:

$$\begin{aligned} \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})\Vert _1 = {\mathcal {O}}(2^{-\frac{1}{4} K} + \sqrt{\varepsilon }). \end{aligned}$$

Proof

Let $\phi ^{\varepsilon }$ be a state as constructed in Lemma 4.5, and let $\rho ^{\varepsilon }$ be the random tensor network state using this background state. Then, by Theorem 4.4

$$\begin{aligned}&\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A^{\varepsilon }) - {{\,\textrm{spec}\,}}_+(\phi ^{\varepsilon }_{\Gamma _A})\Vert _1 \\ {}&\quad \le \mathbbm {E}\min _{V_A, V_{{\bar{A}}}} \Vert (V_A \otimes V_{{\bar{A}}} \otimes I_R)\rho ^{\varepsilon }(V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I_R) - \phi ^{\varepsilon }\Vert _1\\&\quad = {\mathcal {O}}({{\,\textrm{tr}\,}}[ \phi ]^{\frac{1}{4}}(2^{-\frac{1}{4}K_1} + 2^{-\frac{1}{4}K_2})) \end{aligned}$$

where $K_1$ is the minimal value over $\Delta _A \in {\mathcal {C}}_1(\Gamma _A)$ of

$$\begin{aligned} H_2(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\varepsilon }|\phi ^{\varepsilon }}&\ge H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\varepsilon }} - \log {{\,\textrm{tr}\,}}[\phi ^\varepsilon ] \\&\ge H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi } - \log {{\,\textrm{tr}\,}}[\phi ^\varepsilon ] \end{aligned}$$

using Eq. (4.4), the defining property of $\phi ^{\varepsilon }$ from Lemma 4.5 and accounting for the normalization. Similarly $K_2$ is the minimal value over $\Delta _A \in {\mathcal {C}}_2(\Gamma _A)$ of

$$\begin{aligned} H_2(\Delta _A{\setminus } \Gamma _A \vert \Gamma _A)_{\phi ^{\varepsilon }|\phi ^{\varepsilon }}&\ge H_{\min }(\Delta _A{\setminus } \Gamma _A \vert \Gamma _A)_{\phi ^{\varepsilon }} - \log {{\,\textrm{tr}\,}}[\phi ^\varepsilon ] \\&\ge H_{\min }^{\varepsilon }(\Delta _A{\setminus } \Gamma _A \vert \Gamma _A)_{\phi } - \log {{\,\textrm{tr}\,}}[\phi ^\varepsilon ] \end{aligned}$$

and hence $K_1 \ge K$ and $K_2 \ge K$, so

$$\begin{aligned} \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A^{\varepsilon }) - {{\,\textrm{spec}\,}}_+({{\tilde{\phi }}}^{\varepsilon }_{\Gamma _A})\Vert _1 = {\mathcal {O}}(2^{-\frac{1}{4}K}). \end{aligned}$$

Moreover, $\Vert \phi - \phi ^{\varepsilon }\Vert _1 \le 2T(\phi ,\phi ^{\varepsilon }) \le 2P(\phi ,\phi ^{\varepsilon }) = {\mathcal {O}}(\sqrt{\varepsilon })$ by Eq. (4.1) and hence by Lemma 4.3

$$\begin{aligned} \mathbbm {E}\Vert \rho - \rho ^{\varepsilon }\Vert _1 = {\mathcal {O}}(\sqrt{\varepsilon }). \end{aligned}$$

We conclude that

$$\begin{aligned} \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\phi _{\Gamma _A})\Vert _1&\le \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A^{\varepsilon }) - {{\,\textrm{spec}\,}}_+(\phi ^{\varepsilon }_{\Gamma _A})\Vert _1\\&\quad + \mathbbm {E}\Vert \rho _A - \rho ^{\varepsilon }_A\Vert _1 + \Vert \phi _{\Gamma _A} - \phi ^{\varepsilon }_{\Gamma _A}\Vert _1\\&= {\mathcal {O}}(2^{-\frac{1}{4}K} + \sqrt{\varepsilon }). \end{aligned}$$

$\square $

4.4 Two Competing Minimal Cuts

We now consider the case of two minimal cuts for A, where the link states do not have bounded spectral variation as in Sect. 3. We cannot directly apply Theorem 4.7 if there are two competing minimal cuts (note that we will need to define what this means exactly), nor can we assume that the empirical measure, or even a rescaling of it, will converge. Instead, we will see that the spectrum can be approximated in trace distance in a way that allows one to compute entropies, as well as showing convergence of certain measures depending on the spectrum of the reduced state. In particular, under these regularity conditions, we will prove the main result of this subsection: The distribution defined by the boundary state converges to the measure defined by the pushforward along the $\min $-function acting on the distributions of the two competing minimal cuts. The intuition behind this is that the state on the edges can be approximated by a superposition of two states which both do have a unique minimal cut. As a corollary of this result, we will then show how to approximate the von Neumann entropy of the boundary state in such a situation.

We begin by introducing the relevant family of probability distributions for our purposes. Given a probability distribution vector $p \in \mathbbm {R}^d$ of d outcomes and $f(d) > 0$ and $g(d) \in \mathbbm {R}$, consider the random variable which takes value $f(d)[-\log (p_i) - g(d)]$ with probability $p_i$. It has distribution

$$\begin{aligned} \nu _p = \sum _{i = 1}^d p_i \delta _{f(d)[-\log (p_i) - g(d)]}. \end{aligned}$$

(4.15)

Typically, we will have families of probability distributions with increasing d, let $f(d) = (\log d)^{-1}$ and choose g(d) such that it corresponds to the entropy of p. We will also write $\nu _\rho = \nu _{{{\,\textrm{spec}\,}}(\rho )}$ for a quantum state $\rho $. We may also consider Eq. (4.15) for the case where $p \in \mathbbm {R}^d$ is positive but normalized, in which case $\nu _p$ is a finite measure. To motivate the study of the distribution in Eq. (4.15), we note that it is closely related to the central limit theorem. Let $d = d_0^n$ and $\rho = \rho _0^{\otimes n}$ for some state $\rho _0$ on $\mathbbm {C}^{d_0}$. Then,

$$\begin{aligned} \nu ^{(n)}_\rho = \sum _{\lambda \in {{\,\textrm{spec}\,}}(\rho _0^{\otimes n})} \lambda \delta _{\frac{1}{\sqrt{n}}[-\log (\lambda ) - nH(\rho _0)]} \end{aligned}$$

converges in distribution to a normal distribution by the central limit theorem. We study this particular measure because the empirical measures we investigated in Sect. 3 may not have good convergence properties if, for example, the full state on edges consists of many copies of a single non-maximally entangled link state. Moreover, the measure in Eq. (4.15) clearly captures information about the entropy of the probability distribution, and turns out to capture information about second-order asymptotics of information processing tasks [38, 72]. In fact, the example of many independent copies of a single link state is not the only relevant situation. In [14, 19], it was argued that the entanglement spectrum of conformal field theories with large central charge (which are the motivating example to study random tensor networks) have similar behavior.

Before diving into more technical details, we can build some intuition for how distributions of the form Eq. (4.15) will behave in the two-cut setting. Consider the situation with a single random tensor. Let us take a single bulk vertex x and two boundary vertices a and b, with edges (xa) and (xb). There is a single random tensor at x, and we take link states $\vert \phi _1\rangle ^{\otimes n}$ along (xa) and $\vert \phi _2\rangle ^{\otimes n}$ along (xb). We denote

$$\begin{aligned} \vert \phi _j\rangle = \sum _i \sqrt{\lambda _{i,j}} \vert ii\rangle \end{aligned}$$

for $j = 1,2$ and we let $h_j:= H(\{\lambda _{i,j}\})$ be the entanglement entropies along the two minimal link states. There are two separating cuts for the boundary subsystem $\{a\}$: If $h_1 < h_2$, then (xa) is the minimal cut, and if $h_1 > h_2$, the minimal cut is given by (xb). In these cases, for large n, the entanglement spectrum of $\rho _{\{a\}}$ can be approximated by the entanglement spectrum along the minimal cut. What happens at the “phase transition” where $h_1 = h_2 = h$? The intuition is that we can split $\vert \phi _1\rangle ^{\otimes n} \otimes \vert \phi _2\rangle ^{\otimes n}$ into a superposition of the two states, one for which the minimal surface is at (xa) and one for which the minimal surface is at (xb). We find that in this case, if we let $\sigma _j^2 = \sum _i \lambda _{i,j}(\log (\frac{1}{\lambda _{i,j}}) - h)^2$, then for any bounded continuous function f, the quantity

$$\begin{aligned} \sum _{\lambda \in {{\,\textrm{spec}\,}}(\rho _{\{a\}})} \lambda f\left( \frac{-\log (\lambda ) - nh}{\sqrt{n}}\right) \end{aligned}$$

converges to

$$\begin{aligned} \frac{1}{2\pi \sigma _1\sigma _2} \int \int e^{-\frac{(x_1 - h)^2}{2\sigma _1^2} -\frac{(x_2 - h)^2}{2\sigma _2^2} } f(\min (x_1,x_2))\textrm{d}x_1 \textrm{d}x_2 \end{aligned}$$

in probability as n goes to infinity. Our goal in this section will be to prove a general version of this result for full random tensor networks.

In Sect. 3, we used the method of moments. (If all moments of a distribution converge, and the distribution is uniquely determined by its moments, then we have weak convergence.) However, convergence of moments is a stronger condition than weak convergence, and it requires computation of all moments. In this section, we will use that for distributions of the form Eq. (4.15), convergence in distribution follows from convergence in trace norm:

Lemma 1.14

Consider a sequence of increasing integers $\{d_n\}_{n \in \mathbbm {N}}$ and for each n, positive vectors $p^{(n)}, q^{(n)} \in \mathbbm {R}^{d_n}$. Let $\sigma , h: \mathbbm {N}\rightarrow \mathbbm {R}$ be such that $\sigma (d) > 0$ for $d \in \mathbbm {N}$ and $\lim _{d \rightarrow \infty } \sigma (d)^{-1} = 0$. Let

$$\begin{aligned} \nu ^{(n)}_p := \sum _{i=1}^{d_n} p^{(n)}_i \delta _{\frac{1}{\sigma (d_n)}[-\log (p^{(n)}_i)-h(d_n))]} \end{aligned}$$

and similarly let

$$\begin{aligned} \nu ^{(n)}_{q} = \sum _{i=1}^{d_n} q^{(n)}_i \delta _{\frac{1}{\sigma (d_n)}[-\log (q^{(n)}_i)-h(d_n)]}. \end{aligned}$$

1.
Suppose $\Vert p^{(n)} - q^{(n)}\Vert _1$ goes to zero as n goes to infinity, and $\nu ^{(n)}_p \Rightarrow \nu $ for some probability distribution $\nu $. Then, $\nu ^{(n)}_{q} \Rightarrow \nu $.
2.
Suppose the vectors $q^{(n)}$ and $p^{(n)}$ are random, $\mathbbm {E}\Vert p^{(n)} - q^{(n)}\Vert _1$ goes to zero as n goes to infinity, and $\nu ^{(n)}_p \Rightarrow \nu $ in probability. Then, $\nu ^{(n)}_{q} \Rightarrow \nu $ in probability.

Proof

We need to show that

$$\begin{aligned} \left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu (x)\right| \rightarrow 0 \end{aligned}$$

for any $f \in C_b(\mathbbm {R})$. In fact, it suffices to show this for f uniformly continuous (see, for instance, Theorem C.10 in [3]). Let $\varepsilon > 0$, then if we assume f to be uniformly continuous, there exists $\delta > 0$ such that for any x, y for which $\left|x - y\right| \le \delta $, it holds that $\left|f(x) - f(y)\right| \le \varepsilon $. We use a triangle inequality to bound

$$\begin{aligned} \left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu (x)\right|&\le \left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu _p^{(n)}(x)\right|\nonumber \\&\quad + \left|\int f(x) \textrm{d}\nu _p^{(n)}(x) - \int f(x) \textrm{d}\nu (x)\right|. \end{aligned}$$

(4.16)

Since $\nu _p^{(n)} \Rightarrow \nu $, the second term on the right-hand side vanishes as n goes to infinity. If we write $g_n(x) = \frac{1}{\sigma (d_n)}(\log \frac{1}{x} - h(d_n))$, then the first term on the right-hand side of Eq. (4.16) is given by

$$\begin{aligned} \left|\sum _{i = 1}^{d_n} q_i^{(n)} f(g_n(q_i^{(n)})) - p_i^{(n)} f(g_n(p_i^{(n)}))\right|. \end{aligned}$$

This can be bounded by

$$\begin{aligned} \sum _{i = 1}^{d_n} \left|q_i^{(n)} f(g_n(q_i^{(n)})) - p_i^{(n)} f(g_n(p_i^{(n)}))\right|&\le \sum _{i = 1}^{d_n} \left|q_i^{(n)} - p_i^{(n)}\right| \left|f(g_n(q_i^{(n)}))\right| \\&\quad + \sum _{i = 1}^{d_n} p_i^{(n)} \left|f(g_n(q_i^{(n)})) - f(g_n(p_i^{(n)}))\right| \end{aligned}$$

In this expression, the first term is bounded by

$$\begin{aligned} \sum _{i = 1}^{d_n} \left|q_i^{(n)} - p_i^{(n)}\right| \left|f(g_n(q_i^{(n)}))\right| \le C\Vert q^{(n)} - p^{(n)}\Vert _1 \end{aligned}$$

(4.17)

where C is a uniform upper bound for f (using $f \in C_b(\mathbbm {R})$). For the second term, we partition the sum over $[d_n]$ into three sets

$$\begin{aligned} I_1&= \left\{ i \in [d_n] \text { such that } 2^{-\delta \sigma (d_n)} \le \frac{p_i^{(n)}}{q_i^{(n)}} \le 2^{\delta \sigma (d_n)} \right\} \\ I_2&= \left\{ i \in [d_n] \text { such that } \frac{p_i^{(n)}}{q_i^{(n)}} > 2^{\delta \sigma (d_n)} \right\} \\ I_3&= \left\{ i \in [d_n] \text { such that } \frac{p_i^{(n)}}{q_i^{(n)}} < 2^{-\delta \sigma (d_n)} \right\} . \end{aligned}$$

The idea is that for $i \in I_1$, $p_i$ and $q_i$ are sufficiently close that we may use the continuity of f, whereas $I_2$ and $I_3$ cannot have too much weight (using that $p^{(n)}$ and $q^{(n)}$ are close in trace distance). Let us now make this precise. For the sum over the elements in $I_1$, we have

$$\begin{aligned} \sum _{i \in I_1} p_i^{(n)} \left|f(g_n(q_i^{(n)})) - f(g_n(p_i^{(n)}))\right| \le \sum _{i \in I_1} p_i^{(n)} \varepsilon \le \Vert p^{(n)}\Vert _1\varepsilon , \end{aligned}$$

(4.18)

using the uniform continuity of f and the fact that for $i \in I_1$, we have $g(q_i^{(n)}) - g(p_i^{(n)}) = \frac{1}{\sigma (d_n)}\log \left( \frac{p_i^{(n)}}{q_i^{(n)}}\right) $, implying $\left|g(q_i^{(n)}) - g(p_i^{(n)})\right| \le \delta $ by the definition of $I_1$. Next, we observe that

$$\begin{aligned} \Vert p^{(n)} - q^{(n)}\Vert _1&\ge \sum _{i \in I_2} \left|p_i^{(n)} - q_i^{(n)}\right| = \sum _{i \in I_2} p_i^{(n)}\left( 1 - \frac{q_i^{(n)}}{p_i^{(n)}}\right) \\ {}&\ge \sum _{i \in I_2} p_i^{(n)}\left( 1 - 2^{-\delta \sigma (d_n)}\right) , \end{aligned}$$

and hence for $\sigma (d_n) \ge \delta ^{-1}$, we have

$$\begin{aligned} \sum _{i \in I_2} p_i^{(n)} \le 2\Vert p^{(n)} - q^{(n)}\Vert _1. \end{aligned}$$

(4.19)

In analogous fashion, we see that

$$\begin{aligned} \Vert p^{(n)} - q^{(n)}\Vert _1&\ge \sum _{i \in I_3, p_i^{(n)}> 0} \left|p_i^{(n)} - q_i^{(n)}\right| = \sum _{i \in I_3, p_i^{(n)} > 0} p_i^{(n)}\left( \frac{q_i^{(n)}}{p_i^{(n)}} - 1\right) \\&\ge \sum _{i \in I_3} p_i^{(n)}\left( 2^{\delta \sigma (d_n)} - 1\right) , \end{aligned}$$

so for $\sigma (d_n) \ge \delta ^{-1}$ we have

$$\begin{aligned} \sum _{i \in I_3} p_i^{(n)} \le \Vert p^{(n)} - q^{(n)}\Vert _1. \end{aligned}$$

(4.20)

In conclusion, for $\sigma (d_n) \ge \delta ^{-1}$, collecting Eqs. (4.18), (4.19), (4.20) and using that f is uniformly bounded by C, we can bound

$$\begin{aligned} \sum _{i = 1}^{d_n} p_i^{(n)} \left|f(g_n(q_i^{(n)})) - f(g_n(p_i^{(n)}))\right| \le \varepsilon \Vert p^{(n)}\Vert _1 + 3C\Vert p^{(n)} - q^{(n)}\Vert _1. \end{aligned}$$

Together with Eq. (4.17), this implies that for $\sigma (d_n) \ge \delta $, we obtain the bound:

$$\begin{aligned} \left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu _p^{(n)}(x)\right| \le 4C\Vert q^{(n)} - p^{(n)}\Vert _1 + \varepsilon \Vert p^{(n)}\Vert _1. \end{aligned}$$

(4.21)

Since $\nu _{p}^{(n)} \Rightarrow \nu $, $\Vert p^{(n)}\Vert _1 \rightarrow 1$, and since $\varepsilon > 0$ was arbitrary, we conclude that as n goes to infinity

$$\begin{aligned} \left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu (x)\right| \rightarrow 0 \end{aligned}$$

proving 1.

To prove 2, we note that by Markov’s inequality, it suffices to show that

$$\begin{aligned} \mathbbm {E}\left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu (x)\right| \rightarrow 0, \end{aligned}$$

(4.22)

where $f \in C_b(\mathbbm {R})$ is a uniformly continuous function. By Eq. (4.16), it suffices to show

$$\begin{aligned} \mathbbm {E}\left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu _p^{(n)}(x)\right| \rightarrow 0. \end{aligned}$$

Let $\varepsilon > 0$, and let $\delta $ such that if $\left|x - y\right| \le \delta $, $\left|f(x) - f(y)\right| \le \varepsilon $. Then by Eq. (4.21), for $\sigma (d_n) \ge \delta ^{-1}$,

$$\begin{aligned} \mathbbm {E}\left|\int f(x) \textrm{d}\nu _q^{(n)}(x) - \int f(x) \textrm{d}\nu _p^{(n)}(x)\right| \le 4C \mathbbm {E}\Vert q^{(n)} - p^{(n)}\Vert _1 + \varepsilon \mathbbm {E}\Vert p^{(n)}\Vert _1. \end{aligned}$$

Since $\varepsilon > 0$ was arbitrary, $\mathbbm {E}\Vert q^{(n)} - p^{(n)}\Vert _1$ goes to zero, and $\mathbbm {E}\Vert p^{(n)}\Vert _1$ goes to one, we conclude that Eq. (4.22) holds, proving 2. $\square $

The value of this result is clear: So long as we restrict ourselves to measures of the form Eq. (4.15), then we can prove convergence results by proving convergence of trace norms.

We will also need a basic lemma to help estimate overlaps of states.

Lemma 1.15

Suppose we have two bipartite pure states $\psi _{AB}, \phi _{AB} \in {\mathcal {P}}(AB)$. Then,

$$\begin{aligned} \Vert {{\,\textrm{tr}\,}}_{B}[\vert \psi \rangle \langle \phi \vert ]\Vert _1 = F(\psi _B, \phi _B). \end{aligned}$$

Proof

For any operator $M_A$ on ${\mathcal {H}}_A$ it holds that

$$\begin{aligned} \Vert M_A\Vert _1 = \sup _{U_A} \, \left|{{\,\textrm{tr}\,}}[U_A M_A]\right| \end{aligned}$$

where the supremum is over unitary operators $U_A$ on ${\mathcal {H}}_A$, so

$$\begin{aligned} \Vert {{\,\textrm{tr}\,}}_{B}[\vert \psi \rangle \langle \phi \vert ]\Vert _1&= \sup _{U_A} \, \left|{{\,\textrm{tr}\,}}[U_A{{\,\textrm{tr}\,}}_{B}[\vert \psi \rangle \langle \phi \vert ]]\right|\\&= \sup _{U_A} \, \left|{{\,\textrm{tr}\,}}[U_A \otimes I_B \vert \psi \rangle \langle \phi \vert ]\right| = \sup _{U_A} \, \left|\langle \phi \vert U_A \otimes I_B \vert \psi \rangle \right| \end{aligned}$$

which equals $F(\psi _B, \phi _B)$ by Uhlmann’s theorem. $\square $

With these tools in hand, let us now return to the random tensor network setting. Let $\Gamma _{A,1}$ and $\Gamma _{A,2}$ be two cuts with associated sets of edges $\gamma _{A,1}$ and $\gamma _{A,2}$. We will assume these two cuts are non-intersecting: $\gamma _{A,1} \cap \gamma _{A,2} = \emptyset $, and, without loss of generality $\Gamma _{A,1} \subset \Gamma _{A,2}$. We will assume that the full state on edges is a tensor product of link states along the edges, and that the associated central limit measure for the spectrum along the cuts $\gamma _{A,1}$ and $\gamma _{A,2}$ converges to a continuous probability distribution. The most obvious application is where the link states on each edge $\phi _e$ are many copies of some single state $\phi _{e,0}$, so $\phi _e = (\phi _{e,0})^{\otimes n}$, in which case the spectrum is subject to a central limit theorem and the assumptions of Theorem 4.12 are satisfied.

We will now formalize what it means for these two cuts to be competing minimal cuts. We have fixed $\Gamma _{A,1}$ and $\Gamma _{A,2}$ with $\Gamma _{A,1} \subsetneq \Gamma _{A,2}$. Then, we consider three sets of cuts: ${\mathcal {C}}_1(\Gamma _{A,1}, \Gamma _{A,2})$ is the collection of cuts strictly contained in $\Gamma _{A,1}$, ${\mathcal {C}}_2(\Gamma _{A,1}, \Gamma _{A,2})$ is the collection of cuts which strictly contain $\Gamma _{A,2}$, and ${\mathcal {C}}_3(\Gamma _{A,1}, \Gamma _{A,2})$ is the collection of cuts which are “in between” $\Gamma _{A,1}$ and $\Gamma _{A,2}$. That is,

$$\begin{aligned} {\mathcal {C}}_1(\Gamma _{A,1}, \Gamma _{A,2})&= \{ \Delta _A \in C(A) : \Delta _A \subsetneq \Gamma _{A,1}\}\\ {\mathcal {C}}_2(\Gamma _{A,1}, \Gamma _{A,2})&= \{ \Delta _A \in C(A) : \Gamma _{A,2} \subsetneq \Delta _A\}\\ {\mathcal {C}}_3(\Gamma _{A,1}, \Gamma _{A,2})&= \{ \Delta _A \in C(A) : \Gamma _{A,1} \subsetneq \Delta _A \subsetneq \Gamma _{A,2}\} \end{aligned}$$

and we let ${\mathcal {C}} = {\mathcal {C}}_1(\Gamma _{A,1}, \Gamma _{A,2}) \cup {\mathcal {C}}_2(\Gamma _{A,1}, \Gamma _{A,2}) \cup {\mathcal {C}}_3(\Gamma _{A,1}, \Gamma _{A,2})$. As in the single cut case, we would like to say that the surfaces $\Gamma _{A,1}$ and $\Gamma _{A,2}$ are minimal cuts if an appropriate set of conditional entropies are sufficiently large for all $\Delta _A \in {\mathcal {C}}$. We formalize this as follows:

Definition 1.16

A pair of cuts $\Gamma _{A,1}$ and $\Gamma _{A,2}$ for A is a pair of $(\varepsilon ,K)$-minimal cuts if

$$\begin{aligned} H_{\min }^\varepsilon (\Gamma _{A,1} {\setminus } \Delta _A \vert \Gamma _{A,1}^c) \ge K \end{aligned}$$

for all cuts $\Delta _A \in {\mathcal {C}}_1(\Gamma _{A,1}, \Gamma _{A,2})$

$$\begin{aligned} H_{\min }^\varepsilon (\Delta _A {\setminus } \Gamma _{A,2}\vert \Gamma _{A,2}) \ge K \end{aligned}$$

for all cuts $\Delta _A \in {\mathcal {C}}_2(\Gamma _{A,1}, \Gamma _{A,2})$ and

$$\begin{aligned} H_{\min }^\varepsilon (\Delta _A {\setminus } \Gamma _{A,1}\vert \Gamma _{A,1})&\ge K \\ H_{\min }^\varepsilon (\Gamma _{A,2} {\setminus } \Delta _A \vert \Gamma _{A,2}^c)&\ge K \end{aligned}$$

for all cuts $\Delta _A \in {\mathcal {C}}_3(\Gamma _{A,1}, \Gamma _{A,2})$.

We will need a joint smoothing result similar to Lemma 4.5 for the particular case where $\Gamma _{A,1} = A$ and $\Gamma _{A,2}^c = {{\bar{A}}}$. To this end we consider a graph $G = (V,E)$ with a set of boundary vertices $V_\partial $, and a boundary subsystem $A \subseteq V_\partial $. We denote by $\gamma _{A,1}$ the edges incident to A, and $\gamma _{A,2}$ the edges incident to ${{\bar{A}}}$, and assume these sets do not intersect. We let $E_b = E {\setminus } (\gamma _{A,1} \cup \gamma _{A,2})$. For a cut $\Delta _A \in C(A)$ we let $Y^{\Delta _A}$ be the set of half-edges

$$\begin{aligned} Y^{\Delta _A} = \{(e,x) : e = (xy), x \in \Delta _A^c, y \in A \}. \end{aligned}$$

Lemma 1.17

Let $\phi \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(V)$ be a pure background state which is of the form $\phi = \phi _{\gamma _{A,1}} \otimes \phi _{\gamma _{A,2}} \otimes \phi _{E_b}$, where

$$\begin{aligned} \phi _{E_b}&= \bigotimes _{e \in E_b} \phi _e \end{aligned}$$

is a product state and the $\phi _{\gamma _{A,i}}$ have a Schmidt decomposition in the standard basis along the half-edges. Then, there exists a state $\phi ^{\varepsilon }$ which is such that

$$\begin{aligned} P(\phi ,\phi ^{\varepsilon }) \le 2\sqrt{2^{V_b}\varepsilon } \end{aligned}$$

and for all cuts $\Delta _A \in C(A)$ it holds that

$$\begin{aligned} H_{\min }(\Delta _A {\setminus } A \vert A)_{\phi ^\varepsilon } \ge H_{\min }^{\varepsilon }(\Delta _A {\setminus } A \vert A Y^{\Delta _A})_{\phi }. \end{aligned}$$

To state the main result of this section, recall that given a measurable function $f: {\mathcal {X}} \rightarrow {\mathcal {Y}}$ between measure spaces, the pushforward of the function f on a measure $\mu $ on ${\mathcal {X}}$ is defined by $(f_*(\mu ))(A) = \mu (f^{-1})(A)$ for any measureable set $A \subseteq {\mathcal {Y}}$. We apply this to the function $\min : \mathbbm {R}\times \mathbbm {R}\rightarrow \mathbbm {R}$ in the result:

Theorem 1.18

Consider a family of random tensor network states on a graph G with pure state on edges $\phi $, indexed by an increasing sequence of positive integers n. We assume that $\Gamma _{A,1}, \Gamma _{A,2} \in C(A)$ are a pair of non-intersecting $(\varepsilon (n),K(n))$-minimal cuts for all n where $\varepsilon (n) = {\mathcal {O}}(n^{-\gamma })$ for $\gamma > 4$ and $K(n) = \Omega (\log (n)^2)$ as n goes to infinity. We let $H: \mathbbm {N}\rightarrow \mathbbm {R}$, and we assume that

$$\begin{aligned} \nu _{\phi _{\Gamma _{A,i}}} = \sum _{\lambda \in {{\,\textrm{spec}\,}}(\phi _{\Gamma _{A,i}})} \lambda \delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{\lambda }) - H(n)]} \end{aligned}$$

for $i = 1,2,$ is such that $\nu _{\phi _{\Gamma _{A,i}}} \Rightarrow \nu _i$, where $\nu _i$ is a probability distribution with a continuous cumulative distribution function. Then, $\nu _{\rho _A} \Rightarrow \textrm{min}_*(\nu _1,\nu _2)$ in probability.

Proof

Proving this result will require several intermediate results. We provide a very high-level, enumerated sketch of our proof here, involving the following steps:

1.
Study a reduced problem on a subnetwork; this subnetwork is such that the minimal cuts $\gamma _{A,i}$ are incident to the boundary.
2.
Approximate the background state with superpositions of maximally entangled states by binning eigenvalues, and construct approximate tensor network states with the approximate background states.
3.
Prove that the spectrum of the boundary state converges to the spectrum of the approximate tensor network state, which in turn converges to the spectrum of the approximate background states. We write the background state as a superposition of two states, one of which has $\gamma _{A,1}$ as its minimal cut, whereas the other state has $\gamma _{A,2}$ as its minimal cut. We show that the resulting states are approximately orthogonal.
4.
Show that the distribution of the approximate background states converges to the min-distribution, and hence conclude that the spectrum of the boundary state converges to the min-distribution.

Figure 5 provides a more detailed visual sketch of the intuition behind our proof strategy. Lemma 4.8 will be a key tool, as it implies that it will suffice to show convergence in trace norm.

We assume without loss of generality that $\Gamma _{A,1} \subset \Gamma _{A,2}$. We now define the subnetwork that we will analyze in our proof. Let $G' = (V',E')$ be the induced graph on $V' = V_b' \cup V_\partial '$, where $V_b' = \Gamma _{A,2} {\setminus } \Gamma _{A,1}$, and $V_\partial ' = B \cup {\bar{B}}$, with B the set of vertices in $\Gamma _{A,1}$ which are incident to $V_b'$ and ${\bar{B}}$ the set of vertices in $V {\setminus } \Gamma _{A,2}$ which are incident to $V_b'$. The subgraph $G'$ is depicted in Fig. 5b.

We also define the random tensor network state $\tau $ as the state obtained by applying random tensors only on bulk vertices in the complement of $V_b'$. Then for this state, by a slight variation on Theorem 4.7, it holds that

$$\begin{aligned} \mathbbm {E}\min _{V_A, V_{{\bar{A}}}} \Vert (V_A \otimes V_{{\bar{A}}} \otimes I)\tau (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I) - \phi \Vert _1 = {\mathcal {O}}(2^{-K(n)} + \sqrt{\varepsilon (n)}), \end{aligned}$$

where the minimum is over isometries $V_A: {\mathcal {H}}_A \rightarrow {\mathcal {H}}_{\Gamma _{A,1}}$ and $V_{{\bar{A}}}: {\mathcal {H}}_{{\bar{A}}} \rightarrow {\mathcal {H}}_{\Gamma _{A,2}^c}$. Let ${{\tilde{\tau }}} = (V_A \otimes V_{{\bar{A}}} \otimes I)\tau (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I)$ where the $V_A$ and $V_{{\bar{A}}}$ are the isometries that realize the minimum.

Now, consider two random tensor network states on $G'$ obtained by applying the same random tensors on $V_b'$ with background states ${{\tilde{\tau }}}$ and $\phi $, respectively. The state where we take ${{\tilde{\tau }}}$ as the background state yields $(V_A \otimes V_{{\bar{A}}} \otimes I)\rho (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger \otimes I)$. Denote the state where we take $\phi $ as the background state by $\sigma $. Then by Lemma 4.3:

$$\begin{aligned} \mathbbm {E}\Vert (V_A \otimes V_{{\bar{A}}})\rho (V_A^\dagger \otimes V_{{\bar{A}}}^\dagger ) - \sigma \Vert _1 = {\mathcal {O}}(2^{-K(n)} + \sqrt{\varepsilon (n)}). \end{aligned}$$

If we let $E_1$ denote the set of edges (xy) for which both $x,y \in \Gamma _{A,1}$, and $E_2$ the set of edges (xy) for which both $x,y \in \Gamma _{A,2}^c$, then $\sigma _{\Gamma _{A,1}} = \phi _{E_1} \otimes \sigma _B$ and $\sigma _{\Gamma _{A,2}} = \phi _{E_2} \otimes \sigma _{{\bar{B}}}$. This shows that

$$\begin{aligned} \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1 = {\mathcal {O}}(2^{-K(n)} + \sqrt{\varepsilon (n)}) \end{aligned}$$

(4.23)

and ${{\,\textrm{spec}\,}}_+(\sigma _B) = {{\,\textrm{spec}\,}}_+(\sigma _{{\bar{B}}})$, because $\sigma $ is pure, which follows because its background state $\phi $ is pure. We will continue to study $\sigma $ on the reduced graph $G'$, as in Fig. 5b, and at the end of the proof, we will see that Eq. (4.23) will be sufficient to prove the desired properties of ${{\,\textrm{spec}\,}}(\rho _A)$.

We have accomplished Step 1 by reducing the full network to a subnetwork, and we now try to construct an approximation to $\sigma $. We will do so by coarse-graining the spectrum of the background states along the two minimal cuts: slicing the tails of the distribution, binning the remaining eigenvalues, throwing away the smallest bins of the binned distribution, and then approximating the states as a superposition of maximally entangled states defined on the bins. Consider the background states along the two minimal cuts in the Schmidt basis:

$$\begin{aligned} \vert \phi _{\gamma _{A,i}}\rangle = \sum _{J} \sqrt{\lambda _{i,J}}\vert JJ\rangle . \end{aligned}$$

(4.24)

Here, the $\vert J\rangle $ are a tensor product basis along the half-edges (which we again may take to be the standard basis), so that

$$\begin{aligned} \vert JJ\rangle = \bigotimes _{e = (xy) \in \gamma _{A,i}} \vert i_{e,x}\rangle \otimes \vert i_{e,y}\rangle , \end{aligned}$$

where the $\vert i_{e,x}\rangle $ and $\vert i_{e,y}\rangle $ are a basis for ${\mathcal {H}}_{e,x}$ and ${\mathcal {H}}_{e,y}$. First, we truncate the allowed range of eigenvalues by slicing off the tails of the distribution, as in the left side of Fig. 5c. Let

$$\begin{aligned} I_n = [2^{-H(n)-C\sqrt{n}\log (n)}, 2^{-H(n)+C\sqrt{n}\log (n)}] \end{aligned}$$

for some constant C, and let

$$\begin{aligned} \vert \phi _{\gamma _{A,i}}^{(1)}\rangle = \sum _{\lambda _{i,J} \in I_n} \sqrt{\lambda _{i,J}}\vert JJ\rangle . \end{aligned}$$

Note that the entanglement spectrum of $\phi _{\gamma _{A,i}}$ is given by the spectrum of $\phi _{\Gamma _{A,i}}$. This implies that

$$\begin{aligned} \sum _{\lambda _{i,J} \notin I_n} \lambda _{i,J} = \nu _{\phi _{\Gamma _{A,i}}}((-\infty ,-C\log (n))\cup (C\log (n),\infty )) \rightarrow 0 \end{aligned}$$

and hence

$$\begin{aligned} \Vert \phi _{\gamma _{A,i}} - \phi _{\gamma _{A,1}}^{(1)}\Vert _1 \rightarrow 0. \end{aligned}$$

Next, for $\lambda _{i,J} \in I_n$, we define new ${\tilde{\lambda }}_{i,J}$ according to $\log ({\tilde{\lambda }}_{i,J}) = \frac{1}{n^{\alpha }}\lfloor n^{\alpha }\log (\lambda _{i,J})\rfloor $, effectively binning the values of $\log (\lambda _{i,J})$ into intervals of size $n^{-\alpha }$, as in the right side of Fig. 5c. We will choose $\alpha > 0$ small (but other choices will be useful for Corollary 4.14); to be precise, we choose $\alpha < \frac{1}{8}(\gamma - 4)$. We can now define the background state

$$\begin{aligned} \vert \phi _{\gamma _{A,i}}^{(2)}\rangle = \sum _{\lambda _{i,J} \in I_n} \sqrt{{{\tilde{\lambda }}}_{i,J}}\vert JJ\rangle . \end{aligned}$$

Then by Lemma C.4

$$\begin{aligned} \Vert \phi _{\gamma _{A,i}}^{(1)}- \phi _{\gamma _{A,i}}^{(2)}\Vert _1&\le 2\sqrt{\sum _{J} \left|\lambda _{i,J} - {{\tilde{\lambda }}}_{i,J}\right|}, \end{aligned}$$

which we may bound using the fact that $2^{-\frac{1}{n^{\alpha }}} \le \frac{{{\tilde{\lambda }}}_{i,J}}{\lambda _{i,J}} \le 1$, so

$$\begin{aligned} \sum _{J} \left|\lambda _{i,J} - {{\tilde{\lambda }}}_{i,J}\right|&= \sum _J \lambda _{i,J}\left( 1 - \frac{{{\tilde{\lambda }}}_{i,J}}{\lambda _{iJ}}\right) \le \max _{J} \left( 1 - \frac{{{\tilde{\lambda }}}_{i,J}}{\lambda _{i,J}}\right) \\ {}&\le 1 - 2^{-\frac{1}{n^{\alpha }}} = {\mathcal {O}}\left( \frac{1}{n^{\alpha }}\right) . \end{aligned}$$

Thus, $\Vert \phi _{\gamma _{A,i}}^{(1)}- \phi _{\gamma _{A,i}}^{(2)}\Vert _1 = {\mathcal {O}}(n^{-\frac{1}{2} \alpha })$. Since the interval $I_n$ has length ${\mathcal {O}}(\sqrt{n}\log (n))$ and the distinct values ${{\tilde{\lambda }}}_{i,J}$ are $n^{-{\alpha }}$ apart, the ${\tilde{\lambda }}_{i,J}$ take ${\mathcal {O}}(n^{{\alpha } + \frac{1}{2}}\log (n))$ different values. Denote these values by $p_{i,j}$, and let $d_{i,j}$ be their multiplicities. Setting

$$\begin{aligned} q_{i,j} = d_{i,j}p_{i,j}, \end{aligned}$$

we can rewrite the state $\vert \phi _{\gamma _{A,i}}^{(2)}\rangle $ using the collected eigenvalues:

$$\begin{aligned} \vert \phi _{\gamma _{A,i}}^{(2)}\rangle = \sum _{j} \sqrt{q_{i,j}}\vert \Phi ^+_{i,j}\rangle , \end{aligned}$$

where the $\vert \Phi ^+_{i,j}\rangle $ are maximally entangled states of dimension $d_{i,j}$ which are orthogonal, i.e., $\langle \Phi ^+_{i,j}|\Phi ^+_{i,k}\rangle = \delta _{j,k}$. Note that $0 \le q_{i,j} \le 1$ and $\sum _j q_{i,j} \le 1$. Now, we discard any bins that are too small: Let $\beta > \alpha + \frac{1}{2}$ and consider the state

$$\begin{aligned} \vert \phi _{\gamma _{A,i}}^{(3)}\rangle = \sum _{q_{i,j} > n^{-\beta }} \sqrt{q_{i,j}}\vert \Phi ^+_{i,j}\rangle \end{aligned}$$

(4.25)

where we restricted the sum to terms for which $q_{i,j}$ is sufficiently large. (And hence $p_{i,j}$ will be sufficiently close to $d_{i,j}^{-1}$.) Then, since the number of terms is ${\mathcal {O}}(n^{\alpha + \frac{1}{2}}\log (n))$ and using Lemma C.4, we have

$$\begin{aligned} \Vert \phi _{\gamma _{A,i}}^{(2)} - \phi _{\gamma _{A,i}}^{(3)}\Vert _1 ={\mathcal {O}}\left( n^{\frac{1}{2}\alpha + \frac{1}{4} - \frac{1}{2}\beta }\sqrt{\log (n)}\right) \end{aligned}$$

To summarize, $\phi _{\gamma _{A,i}}^{(3)}$ is the state obtained from the original background state $\phi _{\gamma _{A,i}}$ by 1) removing the tails of the spectrum, 2) binning the eigenvalues, and finally 3) dropping any of the bins that are too small. The first approximation incurs an error $\Vert \phi _{\gamma _{A,i}} - \phi _{\gamma _{A,i}}^{(1)}\Vert _1$, which converges to zero, and the second and third approximations incur errors of order ${\mathcal {O}}(n^{-\frac{1}{2}\alpha })$ and ${\mathcal {O}}\left( n^{\frac{1}{2}\alpha + \frac{1}{4} - \frac{1}{2}\beta }\sqrt{\log (n)}\right) $, respectively.

Now, let ${{\tilde{\phi }}}$ be the background state for which we have replaced $\phi _{\gamma _{A,i}}$ by $\phi _{\gamma _{A,i}}^{(3)}$ for $i=1,2$. Recall that $\sigma $ is the state on the random tensor network state constructed with background state $\phi $ on the subgraph $G'$, as in Fig. 5b. We then define the approximation ${{\tilde{\sigma }}}$ to be the tensor network state on $G'$ which uses ${{\tilde{\phi }}}$ as its background state instead of $\phi $. With these background states, we see

$$\begin{aligned} \Vert \phi - {{\tilde{\phi }}}\Vert _1 = {\mathcal {O}}\left( \Vert \phi _{\gamma _{A,i}} - \phi _{\gamma _{A,i}}^{(1)}\Vert _1 + n^{-\frac{1}{2}\alpha } + n^{\frac{1}{2}\alpha + \frac{1}{4} - \frac{1}{2}\beta }\sqrt{\log (n)} \right) \rightarrow 0. \end{aligned}$$

(4.26)

We then apply Lemma 4.3 to find:

$$\begin{aligned} \mathbbm {E}\Vert \sigma - {{\tilde{\sigma }}}\Vert _1 = {\mathcal {O}}\left( \Vert \phi - {{\tilde{\phi }}}\Vert _1\right) \rightarrow 0. \end{aligned}$$

(4.27)

We now make one final approximation to $\sigma $, in which we discard the parts of the background state where the maximally entangled states along each cut are close in dimension, as in Fig. 5d. Consider the state

$$\begin{aligned} \vert {{\bar{\phi }}}_{\gamma _{A,1},\gamma _{A,2}}\rangle = \sum _{2^{n^{\nicefrac {1}{4}}}p_{2,k} \le p_{2,j} \text { or } 2^{-n^{\nicefrac {1}{4}}}p_{2,k} \ge p_{2,j}}\sqrt{q_{1,k}q_{2,j}} \vert \Phi _{1,k}\rangle \otimes \vert \Phi _{2,j}\rangle , \end{aligned}$$

where the sum is still only over those j and k for which $q_{1,k} > n^{-\beta }$ and $q_{2,j} > n^{-\beta }$. If we let

$$\begin{aligned} D_n = \{(x,y) \in \mathbbm {R}^2 : x - n^{-\frac{1}{4}} - n^{-\frac{1}{2}} \le y \le x + n^{-\frac{1}{4}} + n^{-\frac{1}{2}}, \left|x\right| \le \log (n) + 1\}, \end{aligned}$$

then as n increases, the measure of $D_n$ converges to zero and since the measures $\nu _{\phi _{\gamma _{A,i}}} \Rightarrow \nu _i$ and $\nu _i$ has continuous cumulative distribution function

$$\begin{aligned} \sum _{2^{-n^{\nicefrac {1}{4}}}p_{1,j} \le p_{2,k} \le 2^{n^{\nicefrac {1}{4}}}p_{1,j}} q_{1,j}q_{2,k} \le (\nu _{\phi _{\Gamma _{A,1}}} \times \nu _{\phi _{\Gamma _{A,2}}}) (D_n)\rightarrow 0. \end{aligned}$$

(4.28)

Hence, if we let ${{\bar{\phi }}}$ denote the background state with ${\tilde{\phi }}_{\gamma _{A,1}}\otimes {\tilde{\phi }}_{\gamma _{A,2}}$ replaced by ${{\bar{\phi }}}_{\gamma _{A,1},\gamma _{A,2}}$, we get that $\Vert {{\bar{\phi }}} - {{\tilde{\phi }}}\Vert _1 \rightarrow 0$. By Lemma 4.3, if we denote by ${{\bar{\sigma }}}$ the state we obtain on $G'$ by using ${{\bar{\phi }}}$ rather than ${{\tilde{\phi }}}$ as the background state, we get

$$\begin{aligned} \mathbbm {E}\Vert {{\bar{\sigma }}} - \sigma \Vert _1 \rightarrow 0. \end{aligned}$$

(4.29)

We pause here to note that we have accomplished Step 2: We have constructed an approximation ${{\bar{\sigma }}}$ to $\sigma $ by approximating the background state along the cuts as superpositions of maximally entangled states. The utility in doing so is that working with this approximated tensor network state allows us to reduce to calculations where we restrict to a maximally entangled state along one of the two surfaces. Our next major step is to then show that the spectrum of ${{\bar{\sigma }}}$ can be used to approximate the spectrum of $\rho $, which will then allow us to analyze the convergence of the corresponding distribution. As a first intermediate step, we will show that ${\bar{\sigma }}$ can be approximated as a superposition of a state with minimal cut $\gamma _{A,1}$ and a state with minimal cut $\gamma _{A,2}$. This will allow us to more easily reason about how the background states are related to the spectrum of ${\bar{\sigma }}$, in turn, $\sigma $, and in turn, $\rho _A$.

Let us write

$$\begin{aligned} \vert \phi _{\gamma _{A,1},\gamma _{A,2}}^{(j,a)}\rangle&= \sum _{k \text { s.t. } p_{1,k} \ge p_{2,j}2^{n^{\nicefrac {1}{4}}}} \sqrt{q_{1,k} q_{2,j}} \vert \Phi ^+_{1,k}\rangle \otimes \vert \Phi ^+_{2,j}\rangle \\ \vert \phi _{\gamma _{A,1},\gamma _{A,2}}^{(j,b)}\rangle&= \sum _{k \text { s.t. } p_{2,k} \ge p_{1,j}2^{n^{\nicefrac {1}{4}}}} \sqrt{q_{1,j} q_{2,k}} \vert \Phi ^+_{1,j}\rangle \otimes \vert \Phi ^+_{2,k}\rangle , \end{aligned}$$

allowing us to write the background state ${{\bar{\phi }}}_{\gamma _{A,1},\gamma _{A,2}}$ as a different superposition, as depicted in Fig. 5e:

$$\begin{aligned} \vert {{\bar{\phi }}}_{\gamma _{A,1},\gamma _{A,2}}\rangle&= \sum _{j} \vert \phi _{\gamma _{A,1},\gamma _{A,2}}^{(j,a)}\rangle + \sum _j \vert \phi _{\gamma _{A,1},\gamma _{A,2}}^{(j,b)}\rangle . \end{aligned}$$

Let $\vert \phi ^{(j,a)}\rangle $ and $\vert \phi ^{(j,b)}\rangle $ denote the background states on $G'$ where we have replaced $\vert {{\bar{\phi }}}_{\gamma _{A,1},\gamma _{A,2}}\rangle $ by $\vert \phi _{\gamma _{A,1},\gamma _{A,2}}^{(j,a)}\rangle $ and $\vert \phi _{\gamma _{A,1},\gamma _{A,2}}^{(j,b)}\rangle $, respectively, and let

$$\begin{aligned} \vert \phi ^{a}\rangle = \sum _j \vert \phi ^{(j,a)}\rangle , \qquad \vert \phi ^{b}\rangle = \sum _j \vert \phi ^{(j,b)}\rangle . \end{aligned}$$

Denote by $\vert \sigma ^{(j,a)}\rangle $, $\vert \sigma ^{(j,b)}\rangle $ the random tensor network states on $G'$ with background states $\vert \phi ^{(j,a)}\rangle $ and $\vert \phi ^{(j,b)}\rangle $, respectively. Similarly, denote by $\vert \sigma ^{(a)}\rangle $ and $\vert \sigma ^{(b)}\rangle $ the random tensor network states on $G'$ with background states $\vert \phi ^{(a)}\rangle $ and $\vert \phi ^{(b)}\rangle $, respectively.

We start with the following bound on the rank of $\sigma ^{(j,a)}_B$:

$$\begin{aligned} \begin{aligned} {{\,\textrm{rank}\,}}(\sigma ^{(j,a)}_B)&\le {{\,\textrm{rank}\,}}(\phi ^{(j,a)}_B) \\&\le \sum _{p_{1,k} \ge p_{2,j}2^{n^{\nicefrac {1}{4}}}} d_{1,k} \\&\le \sum _{p_{1,k} \ge p_{2,j}2^{n^{\nicefrac {1}{4}}}} \frac{q_{1,k}}{p_{1,k}}\\&\le \sum _{k} \frac{q_{1,k}2^{-n^{1/4}}}{p_{2,j}} \\&\le \frac{2^{-n^{1/4}}}{p_{2,j}} \end{aligned} \end{aligned}$$

(4.30)

By the same reasoning we may bound the rank of $\sigma ^{(j,b)}_{{\bar{B}}}$ as

$$\begin{aligned} {{\,\textrm{rank}\,}}(\sigma ^{(j,b)}_{{\bar{B}}}) \le \frac{2^{-n^{1/4}}}{p_{1,j}}. \end{aligned}$$

(4.31)

Now, we will argue that the state $\sigma ^{(j,a)}$ has minimal cut $\gamma _{A,1}$, and $\sigma ^{(j,b)}$ has minimal cut $\gamma _{A,2}$, as in Fig. 5f. Intuitively, it is sensible that for $\sigma ^{(j,a)}$ the unique minimal cut is along $\gamma _{A,1}$, as for $\phi ^{(j,a)}$, we have a fixed maximally entangled state along $\gamma _{A,2}$, and a superposition of maximally entangled states of lower dimension along $\gamma _{A,1}$. Similarly, for $\sigma ^{(j,b)}$ the minimal cut is along $\gamma _{A,2}$. To confirm this intuition, we will now show that $\sigma ^{(j,a)}_B \approx \phi ^{(j,a)}_B$ and $\sigma ^{(j,b)}_{{\bar{B}}} \approx \phi ^{(j,b)}_{{\bar{B}}}$.

We show this for $\sigma ^{(j,a)}_B$. In this case, the “minimal” cut is simply B. Let $\Delta _B$ be a cut for B unequal to B or to $V_b' \cup B$. We denote by $\Delta _A$ the associated minimal cut for A on the original graph G given by $\Delta _A = \Delta _B \cup \Gamma _{A,1}$. We let

$$\begin{aligned} Y^{\Delta _B} = \{(e,x) : e = (xy), x \in \Delta _B^c, y \in B \}. \end{aligned}$$

Note that as $\phi $ is a product state we have $H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B)_{\phi } = H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B Y^{\Delta _B})_{\phi }$. We can obtain $\phi ^{(j,a)}$ from $\phi $ by acting with subunital CP maps on B and ${\bar{B}}$ and therefore (by two applications of data processing)

$$\begin{aligned} H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B Y^{\Delta _B})_{\phi ^{(j,a)}}&\ge H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B Y^{\Delta _B})_{\phi } = H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B)_{\phi } \\&\ge H_{\min }^{\varepsilon }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi } \end{aligned}$$

We now consider $\Delta _B = V_b' \cup B$ (in which case $Y^{\Delta _B}$ is empty, since we assume $\gamma _{A,1} \cap \gamma _{A,2} = \emptyset $). Then,

$$\begin{aligned} H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B)_{\phi ^{(j,a)}} \ge H_{\min }(\Delta _B {\setminus } B \vert B)_{\phi ^{(j,a)}} = H_{\min }({{\bar{B}}})_{\phi ^{(j,a)}} - H_{\max }(B)_{\phi ^{(j,a)}} \end{aligned}$$

using the product structure of $\phi ^{(j,a)}$. Now, we compute

$$\begin{aligned} H_{\min }({{\bar{B}}})_{\phi ^{(j,a)}} = - \log (\Vert \phi ^{(j,a)}_{{\bar{B}}}\Vert _{\infty }) \ge -\log (p_{2,j}), \end{aligned}$$

where we recall that $p_{2,j}$ is a binned eigenvalue. By Eq. (4.30), we can bound

$$\begin{aligned} H_{\max }(B)_{\phi ^{(j,a)}}&\le \log ({{\,\textrm{rank}\,}}(\phi ^{(j,a)}_B)) \le \log \left( \frac{2^{-n^{1/4}}}{p_{2,j}}\right) = -\log (p_{2,j}) - n^{\frac{1}{4}}. \end{aligned}$$

In conclusion,

$$\begin{aligned} H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B)_{\phi ^{(j,a)}} \ge n^{\frac{1}{4}}. \end{aligned}$$

By Lemma 4.11, we can find $\phi ^{(\varepsilon ,j,a)}$ such that $P(\phi ^{(j,a)}, \phi ^{(\varepsilon ,j,a)}) = {\mathcal {O}}(\sqrt{\varepsilon })$ and for all cuts $\Delta _B \ne B$ we have

$$\begin{aligned} H_{\min }(\Delta _B {\setminus } B \vert B)_{\phi ^{(\varepsilon ,j,a)}} \ge H_{\min }^{\varepsilon }(\Delta _B {\setminus } B \vert B Y^{\Delta _B})_{\phi ^{(j,a)}} \end{aligned}$$

and hence for $\Delta _B \ne V_b' \cup B$

$$\begin{aligned} H_{\min }(\Delta _B {\setminus } B \vert B)_{\phi ^{(\varepsilon ,j,a)}} \ge H_{\min }^{\varepsilon }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi } \end{aligned}$$

and for $\Delta _B = V_b' \cup B$

$$\begin{aligned} H_{\min }(\Delta _B {\setminus } B \vert B)_{\phi ^{(\varepsilon ,j,a)}} \ge n^{\frac{1}{4}}. \end{aligned}$$

Therefore, Proposition 4.1 allows us to conclude that if we let $\sigma ^{(\varepsilon ,j,a)}$ denote the random tensor network state with background state $\phi ^{(\varepsilon ,j,a)}$, then

$$\begin{aligned} \mathbbm {E}\Vert \sigma ^{(\varepsilon ,j,a)}_B - \phi ^{(\varepsilon ,j,a)}_B\Vert _1 = {\mathcal {O}}\left( 2^{-\frac{1}{2} K(n)} + 2^{-\frac{1}{2} n^{\frac{1}{4}}}\right) , \end{aligned}$$

and by Lemma 4.3 and the fact that $T(\phi ^{(j,a)}, \phi ^{(\varepsilon ,j,a)}) \le P(\phi ^{(j,a)}, \phi ^{(\varepsilon ,j,a)})$ this implies

$$\begin{aligned} \begin{aligned} \mathbbm {E}\Vert \sigma ^{(j,a)}_B - \phi ^{(j,a)}_B\Vert _1&\le \mathbbm {E}\Vert \sigma ^{(\varepsilon ,j,a)}_B - \phi ^{(\varepsilon ,j,a)}_B\Vert _1 \\&\quad + \mathbbm {E}\Vert \sigma ^{(\varepsilon ,j,a)}_B - \sigma ^{(j,a)}_B\Vert _1 +\Vert \phi ^{(\varepsilon ,j,a)}_B - \phi ^{(j,a)}_B\Vert _1 \\&= {\mathcal {O}}\left( 2^{-\frac{1}{2} K(n)} + 2^{-\frac{1}{2} n^{\frac{1}{4}}} + \sqrt{\varepsilon (n)}\right) . \end{aligned} \end{aligned}$$

It follows that

$$\begin{aligned} \mathbbm {E}\Vert \sigma ^{(a)}_B - \phi ^{(a)}_B\Vert _1&\le \sum _j \mathbbm {E}\Vert \sigma ^{(j,a)}_B - \phi ^{(j,a)}_B\Vert _1 \nonumber \\&= {\mathcal {O}}\left( n^{\alpha + \frac{1}{2}}\log (n)(2^{-\frac{1}{2} K(n)} + 2^{-\frac{1}{2} n^{\frac{1}{4}}} + \sqrt{\varepsilon (n)})\right) \rightarrow 0 \end{aligned}$$

(4.32)

using that the number of terms is ${\mathcal {O}}(n^{\alpha + \frac{1}{2}}\log (n))$. This expression goes to zero since we assume $K(n) = \Omega (\log (n)^2)$ and $\varepsilon (n) = {\mathcal {O}}(n^{-\gamma })$. A completely symmetric argument shows that

$$\begin{aligned} \mathbbm {E}\Vert \sigma ^{(j,b)}_{{{\bar{B}}}} - \phi ^{(j,b)}_{{{\bar{B}}}}\Vert _1 = {\mathcal {O}}\left( 2^{-\frac{1}{2} K(n)} + 2^{-\frac{1}{2} n^{\frac{1}{4}}} + \sqrt{\varepsilon (n)}\right) \end{aligned}$$

(4.33)

and hence

$$\begin{aligned} \begin{aligned} \mathbbm {E}\Vert \sigma ^{(b)}_{{\bar{B}}} - \phi ^{(b)}_{{\bar{B}}}\Vert _1 = {\mathcal {O}}(n^{\alpha + \frac{1}{2}}\log (n)(2^{-\frac{1}{2} K(n)} + 2^{-\frac{1}{2} n^{\frac{1}{4}}} + \varepsilon (n))) \rightarrow 0. \end{aligned} \end{aligned}$$

(4.34)

We have hence shown that $\sigma ^{(j,a)}_B \approx \phi _B^{(j,a)}$ and $\sigma ^{(j,b)}_{{{\bar{B}}}} \approx \phi _{{{\bar{B}}}}^{(j,b)}$ in expectation.

We now show that we can approximate the state ${\bar{\sigma }}_B$ on $G'$ by the sum of these states. We first claim that we can sum over the j index to obtain the $\sigma ^{(a)}_B$ and $\sigma ^{(b)}_B$. To see this, we introduce some notation: Given a self-adjoint matrix X, let us write ${{\,\textrm{supp}\,}}(X)$ for the image, or support, of X. In other words, ${{\,\textrm{supp}\,}}(X)$ is the space spanned by the eigenvectors of X with nonzero eigenvalue. We let ${\mathcal {H}}_{B,j}$ be the support of the reduced state of $\vert \Phi ^+_{1,j}\rangle $ on the B system, and we similarly let ${\mathcal {H}}_{{\bar{B}},j}$ be the support of the reduced state of $\vert \Phi ^+_{2,j}\rangle $ on the ${\bar{B}}$ system. Then for $j \ne k$, the subspace ${\mathcal {H}}_{B,j}$ is orthogonal to ${\mathcal {H}}_{B,k}$ and similarly ${\mathcal {H}}_{{\bar{B}},j}$ is orthogonal to ${\mathcal {H}}_{{\bar{B}},k}$. By construction, it is clear that ${{\,\textrm{supp}\,}}(\sigma ^{(j,a)}_{{\bar{B}}}) \subseteq {\mathcal {H}}_{{\bar{B}},j}$ and ${{\,\textrm{supp}\,}}(\sigma ^{(j,b)}_{B}) \subseteq {\mathcal {H}}_{B,j}$. This orthogonality for indices $j\ne k$ then makes it clear that we can sum the reduced states, as in Fig. 5g:

$$\begin{aligned} \sigma ^{(a)}_B = \sum _{j} \sigma ^{(j,a)}_B, \qquad \sigma ^{(b)}_{{\bar{B}}} = \sum _{j} \sigma ^{(j,b)}_{{\bar{B}}}, \end{aligned}$$

and by the purity of $\sigma ^{(b)}$, we have that $\sigma _B^{(b)}$ has the same spectrum as $\sigma _{{\bar{B}}}^{(b)}$, We remind the reader that the (a) and (b) indices indicate that the minimal cut for the state is given by $\gamma _{A,1}$ and $\gamma _{A,2}$, respectively. Naturally, this also holds for the summed states, by orthogonality of the summands.

We now claim that ${{\bar{\sigma }}}_B \approx \sigma ^{(a)}_B + \sigma ^{(b)}_B$, i.e., the spectrum of the approximate state on $G'$, as in Eq. (4.29), is well approximated by the sum of two states with differing minimal cuts. We estimate their difference by

$$\begin{aligned} \mathbbm {E}\Vert \sigma ^{(a)}_B + \sigma ^{(b)}_B - {{\bar{\sigma }}}_B\Vert _1&= \mathbbm {E}\Vert \sum _{j,k} {{\,\textrm{tr}\,}}_{{\bar{B}}}[\vert \sigma ^{(j,a)}\rangle \langle \sigma ^{(k,b)}\vert + \vert \sigma ^{(k,b)}\rangle \langle \sigma ^{(j,a)}\vert ]\Vert _1 \\&\le 2 \sum _{j,k}\mathbbm {E}\Vert {{\,\textrm{tr}\,}}_{{\bar{B}}}[\vert \sigma ^{(j,a)}\rangle \langle \sigma ^{(k,b)}\vert ]\Vert _1\\&= 2\sum _{j,k} \mathbbm {E}F({{\bar{\sigma }}}^{(j,a)}_{{\bar{B}}}, {{\bar{\sigma }}}^{(k,b)}_{{\bar{B}}}) \end{aligned}$$

by Lemma 4.9. Now, if $\rho $, $\sigma _1$ and $\sigma _2$ are positive operators, then by Lemma B.9 in [34], we have

$$\begin{aligned} \left|F(\rho ,\sigma _1) - F(\rho ,\sigma _2)\right| \le \sqrt{\Vert \sigma _1 - \sigma _2\Vert _1 {{\,\textrm{tr}\,}}[\rho ]}. \end{aligned}$$

So, we find that

$$\begin{aligned} \mathbbm {E}F(\sigma ^{(j,a)}_{{\bar{B}}}, \sigma ^{(k,b)}_{{\bar{B}}})&\le \mathbbm {E}\left|F(\sigma ^{(j,a)}_{{\bar{B}}}, \sigma ^{(k,b)}_{{\bar{B}}}) - F(\sigma ^{(j,a)}_{{\bar{B}}}, \phi ^{(k,b)}_{{\bar{B}}})\right| + \mathbbm {E}F(\sigma ^{(j,a)}_{{\bar{B}}}, \phi ^{(k,b)}_{{\bar{B}}}))\\&\le \mathbbm {E}\sqrt{{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}] \Vert \sigma ^{(k,b)}_{{\bar{B}}} - \phi ^{(k,b)}_{{\bar{B}}})\Vert _1} + \mathbbm {E}F(\sigma ^{(j,a)}_{{\bar{B}}}, \phi ^{(k,b)}_{{\bar{B}}})\\&\le \sqrt{\mathbbm {E}{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}]} \sqrt{\mathbbm {E}\Vert \sigma ^{(k,b)}_{{\bar{B}}} - \phi ^{(k,b)}_{{\bar{B}}})\Vert _1} + \mathbbm {E}F(\sigma ^{(j,a)}_{{\bar{B}}}, \phi ^{(k,b)}_{{\bar{B}}}), \end{aligned}$$

where we used Cauchy–Schwarz in the last step. For the first term, we may use Eq. (4.33) and $\mathbbm {E}{{\,\textrm{tr}\,}}[{{\bar{\sigma }}}^{(j,a)}_{{\bar{B}}}] \le 1$ to see

$$\begin{aligned} \sqrt{\mathbbm {E}{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}]} \sqrt{\mathbbm {E}\Vert \sigma ^{(k,b)}_{{\bar{B}}} - \phi ^{(k,b)}_{{\bar{B}}})\Vert _1} = {\mathcal {O}}(2^{-\frac{1}{4} K(n)} + 2^{-\frac{1}{4} n^{\frac{1}{4}}} + (\varepsilon (n))^{\frac{1}{4}}). \end{aligned}$$

For the second term we use a basic estimate on the fidelity: If $\rho $ and $\sigma $ are positive operators, then

$$\begin{aligned} F(\rho ,\sigma )&= \Vert \sqrt{\rho }\sqrt{\sigma }\Vert _1 \le \Vert \sqrt{\rho }\Vert _1 \Vert \sqrt{\sigma }\Vert _{\infty } \\&\le \sqrt{{{\,\textrm{rank}\,}}(\rho )} \Vert \sqrt{\rho }\Vert _2 \Vert \sqrt{\sigma }\Vert _{\infty } = \sqrt{{{\,\textrm{rank}\,}}(\rho ){{\,\textrm{tr}\,}}[\rho ] \Vert \sigma \Vert _{\infty }}, \end{aligned}$$

which follows by Hölder’s inequality and the standard relation between Schatten 1 and 2-norms. Then, the second term can be estimated as follows: Write $P_{{\bar{B}},j}$ for the projection onto ${\mathcal {H}}_{{\bar{B}}, j}$, then

$$\begin{aligned} F(\sigma ^{(j,a)}_{{\bar{B}}}, \phi ^{(k,b)}_{{\bar{B}}})&= F(\sigma ^{(j,a)}_{{\bar{B}}}, P_{{\bar{B}},j}\phi ^{(k,b)}_{{\bar{B}}}P_{{\bar{B}},j}) \\&\le \sqrt{{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}] {{\,\textrm{rank}\,}}(\sigma ^{(j,a)}_{{\bar{B}}}) \Vert P_{{\bar{B}},j}\phi ^{(k,b)}_{{\bar{B}}}P_{{\bar{B}},j}\Vert _{\infty }} \\&\le \sqrt{{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}] \frac{2^{-n^{1/4}}}{p_{2,j}} p_{2,j}} \\ {}&\le \sqrt{{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}] 2^{-n^{1/4}}} \end{aligned}$$

using Eq. (4.30) and $\Vert P_{{\bar{B}},j}\phi ^{(k,b)}_{{\bar{B}}}P_{{\bar{B}},j}\Vert _{\infty } \le p_{2,j}$. Thus

$$\begin{aligned} \mathbbm {E}F(\sigma ^{(j,a)}_{{\bar{B}}}, \phi ^{(k,b)}_{{\bar{B}}}) \le \sqrt{ \mathbbm {E}{{\,\textrm{tr}\,}}[\sigma ^{(j,a)}_{{\bar{B}}}] 2^{-n^{1/4}}} \le 2^{-\frac{1}{2} n^{1/4}} \end{aligned}$$

and we may estimate

$$\begin{aligned} \mathbbm {E}F(\sigma ^{(j,a)}_{{\bar{B}}}, \sigma ^{(k,b)}_{{\bar{B}}}) = {\mathcal {O}}(2^{-\frac{1}{4} K(n)} + 2^{-\frac{1}{4} n^{\frac{1}{4}}} + (\varepsilon (n))^{\frac{1}{4}}). \end{aligned}$$

and therefore

$$\begin{aligned} \begin{aligned} \mathbbm {E}\Vert \sigma ^{(a)}_B + \sigma ^{(b)}_B - {{\bar{\sigma }}}_B\Vert _1&\le 2\sum _{j,k} \mathbbm {E}F({{\bar{\sigma }}}^{(j,a)}_{{\bar{B}}}, {{\bar{\sigma }}}^{(k,b)}_{{\bar{B}}})\\&= {\mathcal {O}}(n^{2\alpha + 1} \log (n)^2(2^{-\frac{1}{4} K(n)} + 2^{-\frac{1}{4} n^{\frac{1}{4}}} + (\varepsilon (n))^{\frac{1}{4}})) \rightarrow 0 \end{aligned} \end{aligned}$$

(4.35)

using that the number of terms is ${\mathcal {O}}(n^{2\alpha + 1}\log (n)^2)$, $K(n) = \Omega (\log (n)^2)$ and $\varepsilon (n) = {\mathcal {O}}(n^{-\gamma })$, and our choice of $\alpha $ is such that ${\mathcal {O}}(n^{2\alpha + 1} \log (n)^2 n^{-\frac{1}{4} \gamma })$ goes to zero.

To summarize, we have shown that we can approximate the random tensor network state ${{\bar{\sigma }}}_B$ on $G'$ by $\sigma ^{(a)}_B + \sigma ^{(b)}_B$, and we can approximate $\sigma ^{(a)}_B$ by $\phi ^{(a)}_B$. Moreover, we can approximate $\sigma ^{(b)}_{{\bar{B}}}$ by $\phi ^{(b)}_{{\bar{B}}}$, so the spectrum of $\sigma ^{(b)}_B$ can be approximated by the spectrum of $\phi ^{(b)}_{{\bar{B}}}$. Recall that $\sigma ^{(a)}$ and $\sigma ^{(b)}$ are the random tensor network states on $G'$ which take $\phi ^{(a)}$ and $\phi ^{(b)}$ as background states, respectively. Furthermore, recall that our larger goal is to show that the spectra of the reduced background states $\phi _B^{(a)}$ and $\phi _B^{(b)}$ are, in some sense, close to the spectrum of the approximate random tensor network state $\sigma _B$ on the subgraph $G'$. This is a two-step process:

1.
Show that $\phi ^{(a)}_B$ and $\sigma _B^{(a)}$ can be slightly deformed to $\phi ^{(a,\perp )}_B$ and $\sigma _B^{(a,\perp )}$, states with support orthogonal to ${{\,\textrm{supp}\,}}(\sigma _B^{(j,b)})$ for all j.
2.
Use the slightly deformed states to show that ${{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B\oplus \phi ^{(b)}_B)$ is close to ${{\,\textrm{spec}\,}}_+(\sigma ^{(a)}_B+\sigma ^{(b)}_B)$, and follow the chain of approximations: $\sigma ^{(a)}_B+\sigma ^{(b)}_B \approx {{\bar{\sigma }}}_B \approx \sigma _B$ along with ${{\,\textrm{spec}\,}}_+(\sigma _B) \approx {{\,\textrm{spec}\,}}_+(\rho _A)$ to conclude that ${{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B\oplus \phi ^{(b)}_B) \approx {{\,\textrm{spec}\,}}_+(\rho _A)$

Now, consider the state $\sigma ^{(j,b)}_B$. It has support contained in ${\mathcal {H}}_{B,j}$ and by Eq. (4.31), it has ${{\,\textrm{rank}\,}}(\sigma ^{(j,b)}_B) \le 2^{-n^{1/4}} p_{1,j}^{-1}$. Let $Q_j$ be the projection onto ${{\,\textrm{supp}\,}}(\sigma ^{(j,b)}_B)$, so ${{\,\textrm{supp}\,}}(Q_j) \subseteq {\mathcal {H}}_{B,j}$ and ${{\,\textrm{rank}\,}}(Q_j) \le 2^{-n^{1/4}} p_{1,j}^{-1}$. We find that

$$\begin{aligned} \Vert Q_j \phi ^{(a)}_B\Vert _1 \le {{\,\textrm{rank}\,}}(Q_j) \Vert P_{B,j} \phi ^{(a)}_B P_{B,j}\Vert _{\infty } \le \frac{2^{-n^{1/4}}}{p_{1,j}} p_{1,j} = 2^{-n^{1/4}}. \end{aligned}$$

If we let $Q = \sum _j Q_j$, then we see that $(I - Q)\phi ^{(a)}(I - Q)$ is a small deformation of $\phi _B^{(a)}$:

$$\begin{aligned} \Vert \phi ^{(a)}_B - (I - Q)\phi ^{(a)}(I - Q)\Vert _1&\le 2\Vert Q \phi ^{(a)}\Vert _1 \le 2\sum _j \Vert Q_j \phi ^{(a)}_B\Vert _1\nonumber \\&= {\mathcal {O}}(n^{\nicefrac 92}\log (n)2^{-n^{1/4}}) \rightarrow 0. \end{aligned}$$

(4.36)

With this property in hand, we can show that $\sigma ^{(a)}_B$ can similarly be deformed. Let $\sigma ^{(a,\perp )}_B = (I - Q)\sigma ^{(a)}(I - Q)$. By construction, $\sigma ^{(a,\perp )}_B$ has support orthogonal to that of $\sigma ^{(b)}_B$, and hence, ${{\,\textrm{spec}\,}}_+(\sigma ^{(a,\perp )}_B + \sigma ^{(b)}_B) = {{\,\textrm{spec}\,}}_+(\sigma ^{(a,\perp )}_B \oplus \sigma ^{(b)}_{{\bar{B}}})$. We observe that $\sigma ^{(a,\perp )}_B$ is a small deformation of $\sigma ^{(a)}_B$:

$$\begin{aligned} \begin{aligned} \mathbbm {E}\Vert \sigma _{B}^{(a)} - \sigma ^{(a,\perp )}_B\Vert _1&\le \mathbbm {E}\Vert \sigma _{B}^{(a)} - \phi _{B}^{(a)}\Vert _1\\&\quad + \mathbbm {E}\Vert (I- Q)\sigma _{B}^{(a)}(I- Q) - (I- Q)\phi _{B}^{(a)} (I - Q)\Vert _1\\&\quad + \Vert \phi ^{(a)}_B - (I - Q)\phi ^{(a)}(I - Q)\Vert _1\\&\le 2\mathbbm {E}\Vert \sigma _{B}^{(a)} - \phi _{B}^{(a)}\Vert _1 + \Vert \phi ^{(a)}_B - (I - Q)\phi ^{(a)}(I - Q)\Vert _1 \end{aligned} \end{aligned}$$

(4.37)

which goes to zero, using Eq. (4.32) for the first term, and Eq. (4.36) for the second term. By construction, $(I - Q)\phi ^{(a)}(I - Q)$ has support orthogonal to ${{\,\textrm{supp}\,}}(\sigma ^{(b)}_{B})$. This allows us to bound

$$\begin{aligned}&\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(a)}_B + \sigma ^{(b)}_B)\Vert _1\\&\quad \le \mathbbm {E}\Vert \sigma _{B}^{(a)} - \sigma ^{(a,\perp )}_B\Vert _1 + \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(a,\perp )}_B \oplus \sigma ^{(b)}_B)\Vert _1 \\&\quad = \mathbbm {E}\Vert \sigma _{B}^{(a)} - \sigma ^{(a,\perp )}_B\Vert _1 + \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B)\\ {}&\qquad - {{\,\textrm{spec}\,}}_+(\sigma ^{(a,\perp )}_B) + {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(b)}_B)\Vert _1 \\&\quad \le \mathbbm {E}\Vert \sigma _{B}^{(a)} - \sigma ^{(a,\perp )}_B\Vert _1 + \mathbbm {E}\Vert \phi ^{(a)}_B - \sigma ^{(a,\perp )}_B\Vert _1 + \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(b)}_B)\Vert _1, \end{aligned}$$

where we have used the bound on the difference of spectra by the trace distance of the corresponding states Eq. (1.5). Each term on the RHS can be shown to converge to zero. The first term goes to zero by Eq. (4.37). Similarly, the second term can be bounded by

$$\begin{aligned} \mathbbm {E}\Vert \phi ^{(a)}_B - \sigma ^{(a,\perp )}_B\Vert _1 \le \mathbbm {E}\Vert \phi ^{(a)}_B - \sigma ^{(a)}_B\Vert _1 + \mathbbm {E}\Vert \sigma ^{(a)}_B - \sigma ^{(a,\perp )}_B\Vert _1 \end{aligned}$$

which goes to zero by Eq. (4.37) and Eq. (4.32). The third term can be estimated by observing that ${{\,\textrm{spec}\,}}_+(\sigma ^{(b)}_B) = {{\,\textrm{spec}\,}}_+(\sigma ^{(b)}_{{\bar{B}}})$ and

$$\begin{aligned} \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(b)}_{B})\Vert _1 \le \mathbbm {E}\Vert \phi ^{(a)}_{{\bar{B}}} - \sigma ^{(b)}_{{\bar{B}}}\Vert _1 \end{aligned}$$

which goes to zero by Eq. (4.34). We conclude that $\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(a)}_B + \sigma ^{(b)}_B)\Vert _1 \rightarrow 0$.

We now follow a chain of approximations to get the desired closeness between ${{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(a)}_{{\bar{B}}})$ and ${{\,\textrm{spec}\,}}_+(\rho _A)$. Using Eqs. (4.29) and (4.35), we can see that

$$\begin{aligned} \mathbbm {E}\Vert \sigma _B - (\sigma ^{(a)}_B + \sigma ^{(b)}_B)\Vert _1 \le \mathbbm {E}\Vert \sigma _B - {{\bar{\sigma }}}_B\Vert _1 + \mathbbm {E}\Vert \sigma ^{(a)}_B + \sigma ^{(b)}_B - {{\bar{\sigma }}}_B\Vert _1 \rightarrow 0, \end{aligned}$$

so $\mathbbm {E}\Vert \sigma _B - (\sigma ^{(a)}_B + \sigma ^{(b)}_B)\Vert _1$ will converge to 0. This then implies

$$\begin{aligned}&\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\sigma _B) - {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(a)}_{{\bar{B}}})\Vert _1 \\&\quad \le \mathbbm {E}\Vert \sigma _B - (\sigma ^{(a)}_B + \sigma ^{(b)}_B)\Vert _1 \\&\qquad + \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(a)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma ^{(a)}_B + \sigma ^{(b)}_B)\Vert _1, \end{aligned}$$

and as we have just shown, both terms on the RHS converge to zero. Together with Eq. (4.23), this yields the desired relationship to the spectrum of $\rho $:

$$\begin{aligned} \begin{aligned}&\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(b)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\rho _A)\Vert _1 \\&\quad \le \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\phi ^{(a)}_B \oplus \phi ^{(b)}_{{\bar{B}}}) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1\\&\qquad + \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1 \rightarrow 0. \end{aligned} \end{aligned}$$

(4.38)

We pause here again to note that we have accomplished Step 3: approximating the spectrum of $\rho _A$ by the spectra of the (approximate) background states. The final step will then be to consider the convergence properties of the distributions on the background states, which will then translate to convergence for the distribution on $\rho _A$.

Explicitly, we want to relate the above result to $\textrm{min}_*(\nu _1,\nu _2)$. First, we observe that by Lemma 4.8 and Eq. (4.26), $\nu _{{{\tilde{\phi }}}_B} \Rightarrow \nu _1$ and $\nu _{{{\tilde{\phi }}}_{{\bar{B}}}} \Rightarrow \nu _2$, and since $\min $ is a continuous function, convergence holds for the pushforward measure:

$$\begin{aligned} \textrm{min}_*(\nu _{{{\tilde{\phi }}}_{B}},\nu _{{{\tilde{\phi }}}_{{\bar{B}}}}) \Rightarrow \textrm{min}_*(\nu _1,\nu _2). \end{aligned}$$

(4.39)

We compute

$$\begin{aligned} \begin{aligned} \textrm{min}_*(\nu _{{{\tilde{\phi }}}_{B}},\nu _{{{\tilde{\phi }}}_{{\bar{B}}}})&= \sum _{j,k} q_{1,k}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\min (\log (\frac{1}{p_{1,k}}),\log (\frac{1}{p_{2,j}})) - H(n)]}\\&= \sum _{j}\left( \sum _{p_{1,k} \ge 2^{n^{1/4}}p_{2,j}} q_{1,k}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{p_{1,k}}) - H(n)]}\right. \\&\quad \left. + \sum _{p_{1,k} \le 2^{-n^{\nicefrac {1}{4}}}p_{2,j}} q_{1,k}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{p_{2,j}}) - H(n)]} \right) \\&\quad + \sum _{2^{-n^{\nicefrac {1}{4}}}p_{2,j} \le p_{1,k} \le 2^{n^{\nicefrac {1}{4}}}p_{2,j}} q_{1,k}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\min (\log (\frac{1}{p_{1,k}}),\log (\frac{1}{p_{2,j}})) - H(n)]}\\&= \sum _{k} q_{1,k}q_{2,>k}\delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{p_{1,k}}) - H(n)]}\\&\quad + \sum _{j} q_{1,> j}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{p_{2,j}}) - H(n)]} + \nu _0 \end{aligned} \end{aligned}$$

(4.40)

where we have written

$$\begin{aligned} q_{1,>j}&= \sum _{p_{1,k} \ge 2^{n^{1/4}}p_{2,j}} q_{1,k}\\ q_{2,>k}&= \sum _{p_{1,k} \le 2^{-n^{\nicefrac {1}{4}}}p_{2,j}} q_{2,j} \end{aligned}$$

and

$$\begin{aligned} \nu _0 = \sum _{2^{-n^{\nicefrac {1}{4}}}p_{2,j} \le p_{1,k} \le 2^{n^{\nicefrac {1}{4}}}p_{2,j}} q_{1,k}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\min (\log (\frac{1}{p_{1,k}}),\log (\frac{1}{p_{2,j}})) - H(n)]}. \end{aligned}$$

Then by Eq. (4.28), we see that $\nu _0 \Rightarrow 0$. Using Eq. (4.39), we conclude

$$\begin{aligned} \textrm{min}_*(\nu _{{{\tilde{\phi }}}_{B}},\nu _{{{\tilde{\phi }}}_{{\bar{B}}}}) - \nu _0 \Rightarrow \textrm{min}_*(\nu _1,\nu _2). \end{aligned}$$

(4.41)

On the other hand, by construction, we have

$$\begin{aligned} \nu _{{{\,\textrm{spec}\,}}(\phi ^{(a)}_B \oplus \phi ^{(b)}_{{\bar{B}}})} = \nu _{\phi ^{(a)}_B} + \nu _{\phi ^{(b)}_{{\bar{B}}}}. \end{aligned}$$

We can explicitly write down

$$\begin{aligned} \nu _{\phi ^{(a)}_B} + \nu _{\phi ^{(b)}_{{\bar{B}}}}&= \sum _k q_{1,k}q_{2,>k}\delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{p_{1,k}q_{2,>k}}) - H(n)]} \\&\quad + \sum _{j} q_{1,> k}q_{2,j}\delta _{\frac{1}{\sqrt{n}}[\log (\frac{1}{p_{2,j}q_{1,> k}}) - H(n)]}. \end{aligned}$$

Since the $q_{1,k}$ and $q_{2,j}$ are at least $n^{-\beta }$ (if the corresponding term is nonzero), $\left|\log (q_{1,>j})\right| \le \beta \log (n)$ and $\left|\log (q_{2,>k})\right| \le \beta \log (n)$. With these bounds in mind, we can compare to the last line of Eq. (4.40). More precisely, if we let $\nu '_{{{\bar{\phi }}}} = \textrm{min}_*(\nu _{{{\tilde{\phi }}}_{B}},\nu _{{{\tilde{\phi }}}_{{\bar{B}}}}) - \nu _0$, then for uniformly continuous $f \in C_b(\mathbbm {R})$, we have

$$\begin{aligned} \lim _{n \rightarrow \infty } \left|\int f(x) \textrm{d}\nu '_{{{\bar{\phi }}}}(x) - \int f(x) \textrm{d}\nu _{\phi ^{(a)}_B}(x) - \int f(x) \textrm{d}\nu _{\phi ^{(b)}_{{\bar{B}}}}(x)\right| = 0 \end{aligned}$$

We conclude that

$$\begin{aligned} \nu _{\phi ^{(a)}_B} + \nu _{\phi ^{(b)}_{{\bar{B}}}} \Rightarrow \textrm{min}_*(\nu _1,\nu _2). \end{aligned}$$

(4.42)

Now, we can finally put all of our ingredients together. Recall the statement of convergence in probability implied by the convergence of spectra, in expectation, as in the second part of Lemma 4.8. Then using Eq. (4.38) as the vectors $p^{(n)}$ and $q^{(n)}$ in the statement of Lemma 4.8 and the convergence of distributions in Eq. (4.42), we conclude

$$\begin{aligned} \nu _{\rho _A} \Rightarrow \textrm{min}_*(\nu _1,\nu _2) \end{aligned}$$

(4.43)

$\square $

4.5 Computing Entropies with Two Minimal Cuts

Ideally, we would like to use Theorem 4.12 to compute von Neumann entropies of random tensor network states. However, Theorem 4.12 alone is too weak to allow us to directly compute entropies up to $o(\sqrt{n})$ corrections, as weak convergence of the spectrum does not directly imply convergence of the mean. However, we kept track of various approximation errors in the proof of Theorem 4.12, and we will use these to show that with slightly stronger assumptions, these errors allow to compute the entropy up to ${\mathcal {O}}(\log (n))$ corrections.

To begin, we first bound the difference in the entropy of a sum of density matrices and the sum of the entropies of the individual density matrices:

Lemma 1.19

Suppose $\{p_i\}_{i=1}^m$ is a subnormalized distribution and $\{\rho _i\in {\mathcal {P}}({\mathcal {H}}) \}_{i=1}^m$ is a collection of (unnormalized) density matrices and let $\rho = \sum _i p_i \rho _i \in {\mathcal {P}}({\mathcal {H}})$. Suppose that $C^{-1} \le {{\,\textrm{tr}\,}}[\rho _i] \le C$, and $C^{-1} \le {{\,\textrm{tr}\,}}[\rho ] \le C$, then

$$\begin{aligned} \left|H(\rho ) - \sum _{i=1}^m p_i H(\rho _i)\right| \le C(\log (m) + 2\log (C)). \end{aligned}$$

Proof

If the $\rho _i$ are normalized and $\sum _i p_i = 1$, then by the Holevo bound we have

$$\begin{aligned} \sum _i p_i H(\rho _i) \le H\bigg (\sum _i p_i \rho _i\bigg ) \le \sum _i p_i H(\rho _i) + H(\{p_i\}) \le \sum _i p_i H(\rho _i) + \log (m). \end{aligned}$$

Now let $q_i = {{\,\textrm{tr}\,}}[\rho _i]$, $P = \sum _i {q_i p_i}$ and let $\sigma _i = \frac{\rho _i}{q_i}$, $r_i = \frac{p_i q_i}{P}$. Then, $H(\sum _i p_i \rho _i) = H(P\sum _i r_i \sigma _i) = P H(\sum _i r_i \sigma _i) - P\log (P)$. On the other hand

$$\begin{aligned} P\sum _i r_i H(\sigma _i) = \sum _i p_i H(\rho _i) + \sum _i p_i\log (q_i) \end{aligned}$$

so it follows that

$$\begin{aligned} \left|H\bigg (\sum _i p_i \rho _i\bigg ) - \sum _i p_iH(\rho _i)\right|&\le P\log (m) + \left|P\log (P)\right| + \max \left|\log (q_i)\right|\\&\le C(\log (m) + 2\log (C)). \end{aligned}$$

$\square $

The idea is that for a random tensor network state, we will split up the background state as a superposition of states which are maximally entangled along the two minimal cuts and then use Lemma 4.13. This approach formalizes an argument sketched in [6].

To state our result, we introduce a new function: For vectors of positive numbers $p \in \mathbbm {R}^{d_1}$, $q \in \mathbbm {R}^{d_2}$, we let

$$\begin{aligned} H^*(p,q) := \sum _{i,j} p_i q_j \min \left(\log \left( \frac{1}{p_i}\right) ,\log \left( \frac{1}{q_j}\right) \right). \end{aligned}$$

A key tool we will need is the continuity of the entropy: If $\rho , \sigma \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ are quantum states on a Hilbert space ${\mathcal {H}}$ of dimension d with $T(\rho ,\sigma ) \le \frac{1}{e}$, then the Fannes–Audenaert inequality states that

$$\begin{aligned} \left|H(\rho ) - H(\sigma )\right| \le T(\rho ,\sigma )\log d - T(\rho ,\sigma )\log (T(\rho ,\sigma )). \end{aligned}$$

(4.44)

Let $\eta $ be the function defined by $\eta (x) = x\log x$. Then, if we write ${{\,\textrm{spec}\,}}(\rho ) = \{p_j\}_{j=1}^d$ and ${{\,\textrm{spec}\,}}(\sigma ) = \{q_j\}_{j=1}^d$ for Eq. (4.44) one can actually show

$$\begin{aligned} \left|H(\rho ) - H(\sigma )\right| \le \sum _{j=1}^d \, \left|\eta (p_j) - \eta (q_j)\right| \le T(\rho ,\sigma )\log d - T(\rho ,\sigma )\log (T(\rho ,\sigma )). \end{aligned}$$

We will use this to show continuity of $H^*$ as well. Consider $\rho _i, \sigma _i \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}}_i)$ with $\dim ({\mathcal {H}}_i) \le d$ for $i = 1,2$. We let ${{\,\textrm{spec}\,}}(\rho _i) = \{p_{i,j}\}_{j=1}^{d_i}$ and ${{\,\textrm{spec}\,}}(\sigma _i) = \{q_{i,j}\}_{j=1}^{d_i}$. For real numbers $x_i, y_i$ we have

$$\begin{aligned} \left|\min (x_1,y_1) - \min (x_2,y_2)\right| \le \left|x_1 - x_2\right| + \left|y_1 - y_2\right|. \end{aligned}$$

Then, we see that

$$\begin{aligned}&\left|H^*({{\,\textrm{spec}\,}}(\rho _1),{{\,\textrm{spec}\,}}(\rho _2)) - H^*({{\,\textrm{spec}\,}}(\sigma _1),{{\,\textrm{spec}\,}}(\sigma _2))\right| \\&\quad \le \sum _{j,k} \, \left| p_{1,j} p_{2,k} \min \left( \log \left(\frac{1}{p_{1,j}}\right) ,\log \left(\frac{1}{p_{2,k}}\right)\right) \right. \\&\qquad \left. - q_{1,j} q_{2,k} \min \left( \log \left(\frac{1}{q_{1,j}}\right), \log \left(\frac{1}{q_{2,k}}\right)\right) \right| \\&\quad \le \sum _{j,k} \, \left| p_{1,j} p_{2,k} \log p_{1,j} - q_{1,j} q_{2,k} \log q_{1,j}\right| + \left| p_{1,j} p_{2,k} \log p_{2,k} - q_{1,j} q_{2,k} \log q_{2,k}\right|. \end{aligned}$$

We may estimate the first term by

$$\begin{aligned}&\sum _{j,k} \, \left| p_{1,j} p_{2,k} \log p_{1,j} - q_{1,j} q_{2,k} \log q_{1,j}\right| \\&\quad \le \sum _{j,k} p_{2,k}\left| p_{1,j} \log p_{1,j} - q_{1,j} \log q_{1,j}\right| - \sum _{j,k} \, \left| p_{2,k} - q_{2,k}\right| \, q_{1,j} \log q_{1,j}\\&\quad \le \sum _j \, \left|\eta (p_{1,j}) - \eta (p_{2,j})\right| + \Vert \rho _2 - \sigma _2\Vert _1 H(\sigma _1) \\&\quad \le T(\rho _1,\sigma _1)\log d - T(\rho _1,\sigma _1)\log (T(\rho _1,\sigma _1)) + 2T(\rho _2,\sigma _2)\log d \end{aligned}$$

The second term may be estimated in similar fashion. We conclude that if $T(\rho _1, \sigma _1)\le \varepsilon \le e^{-1}$ and $T(\rho _2, \sigma _2)\le \varepsilon \le e^{-1}$

$$\begin{aligned} \left|H^*({{\,\textrm{spec}\,}}(\rho _1),{{\,\textrm{spec}\,}}(\rho _2)) - H^*({{\,\textrm{spec}\,}}(\sigma _1),{{\,\textrm{spec}\,}}(\sigma _2))\right| \le 6\varepsilon \log d - 2\varepsilon \log (\varepsilon ). \end{aligned}$$

(4.45)

We will now show that we can approximate the entropy as would be expected from Theorem 4.12, if we make some additional assumptions (which in particular are satisfied if the state along each edge is a copy of n states, $\phi _e = \phi _{e,0}^{\otimes n}$).

Corollary 1.20

Let $\rho $ be a random tensor network state satisfying the same assumptions as in Theorem 4.12, and assume additionally that the $\nu _{\phi _{\gamma _{A,i}}}$ have uniformly exponentially decaying tail probabilities and $\varepsilon (n) = {\mathcal {O}}(n^{-\gamma })$ for $\gamma > 10$. Assume that for each edge $e = (xy)$, the bond dimension is $D_e = 2^{{\mathcal {O}}(n)}$ and ${{\,\textrm{tr}\,}}[\phi _{e,x}^2] \le 2^{-\Omega (n)}$. Then with high probability

$$\begin{aligned} \left|H(\rho _A) - H^*({{\,\textrm{spec}\,}}(\phi _{\gamma _{A,1}}),{{\,\textrm{spec}\,}}(\phi _{\gamma _{A,2}}))\right| = {\mathcal {O}}(\log (n)). \end{aligned}$$

Proof

The basic proof strategy will be that of Theorem 4.12: We study a slightly reduced problem on the approximated tensor network states ${\tilde{\sigma }}$ with approximate background states ${\tilde{\phi }}$, work out the entropies for ${\tilde{\sigma }}$ and ${\tilde{\phi }}$, and then argue that the closeness of the resulting entropies will continue to hold for the original tensor network state and background state, up to errors that we carefully keep track of.

First of all, we note that we can reduce to the tensor network state $\sigma $ on the reduced graph $G'$, with error as in Eq. (4.23); in particular

$$\begin{aligned} \mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1 = {\mathcal {O}}(n^{-\frac{\gamma }{2}}). \end{aligned}$$

(4.46)

Next we adapt the part of the proof of Theorem 4.12 where we modify the state along the minimal cuts. In the proof of Theorem 4.12, we first observed that if the (regularized) spectrum along each cut $\nu _{\phi _{\Gamma _{A,i}}}$ has uniformly exponentially decaying tail probabilities, then for sufficiently large C, we can slice off the tails with vanishing probability mass:

$$\begin{aligned} \sum _{\lambda _{i,j} \notin I_n} \lambda _{i,j} = \nu _{\phi _{\Gamma _{A,i}}}((-\infty ,-C\log (n))\cup (C\log (n),\infty )) = O\left( \frac{1}{n^4}\right) . \end{aligned}$$

We also binned the eigenvalues of $\phi $ in the reduced spectrum:

$$\begin{aligned} q_{i,j} = d_{i,j}p_{i,j}, \end{aligned}$$

where $p_{i,j}$ are the bins and $d_{i,j}$ is the multiplicity of each bin. We then removed bins that were too small, leading to a state ${{\tilde{\phi }}}$. In the notation of the proof of Theorem 4.12 we choose $\alpha $ and $\beta $ such that $\alpha > 2$ and $\frac{5}{2} + \alpha < \beta \le \frac{1}{2}(\gamma - 1)$. By Eq. (4.26) this yields an error

$$\begin{aligned} \Vert {{\tilde{\phi }}} - \phi \Vert _1 = o(n^{-1}). \end{aligned}$$

(4.47)

Note that in the proof of Theorem 4.12, we performed one more approximation of removing the “middle” or “diagonal” part of the spectrum and obtained a state ${\bar{\phi }}$; we will not need to do this here. Now, we recall that the binning of eigenvalues allows us to write ${\tilde{\phi }}$ as a superposition over maximally entangled states along each cut:

$$\begin{aligned} \vert {{\tilde{\phi }}}_{\gamma _{A,1}, \gamma _{A,2}}\rangle&= \sum _{j,k} \sqrt{q_{1,j} q_{2,k}} \vert \Phi ^+_{1,j}\rangle \otimes \vert \Phi ^+_{2,k}\rangle , \end{aligned}$$

and use this to decompose ${{\tilde{\phi }}}$ as

$$\begin{aligned} \vert {{\tilde{\phi }}}\rangle&= \sum _{j,k} \sqrt{q_{1,j} q_{2,k}} \vert \psi ^{(j,k)}\rangle , \end{aligned}$$

where in $\vert \psi ^{(j,k)}\rangle $, we have replaced $\vert {{\tilde{\phi }}}_{\gamma _{A,1}, \gamma _{A,2}}\rangle $ with the maximally entangled state $\vert \Phi ^+_{1,j}\rangle \otimes \vert \Phi ^+_{2,k}\rangle $. The states $\vert \psi ^{(j,k)}\rangle $ are normalized background states on the graph $G'$. We let

$$\begin{aligned} \vert \phi ^{(j,k)}\rangle = \sqrt{q_{1,j} q_{2,k}}\vert \psi ^{(j,k)}\rangle \end{aligned}$$

which are subnormalized states. At this point, the key idea of the argument is straightforward. We will consider the random tensor network states which have (a smoothed version) of the background states $\vert \psi ^{(j,k)}\rangle $. These will have entropy close to $\min (\log (d_{1,j}), \log (d_{2,k}))$. Then, we will use Lemma 4.13 to argue that the entropy of $\rho _A$ is approximated up to ${\mathcal {O}}(\log (n))$ terms by the convex combination $\sum _{j,k} q_{1,j} q_{2,k} \min (\log (d_{1,j}), \log (d_{2,k}))$, which we can relate to the desired result. To make this easy argument precise, we will need to take care of smoothing the background state appropriately and ensure that the relevant states are close to normalized.

We will now argue that we can smoothen the states $\phi ^{(j,k)}$. We may assume without loss of generality that the states $\phi ^{(j,k)}$ have nonnegative coefficients in the standard basis $\vert I\rangle = \prod _{e \in E'} \vert i_e i_e\rangle $ where $I = \{i_e\}_{e \in E'}$ runs over all possible basis elements over each edge, so we can write

$$\begin{aligned} \vert \phi ^{(j,k)}\rangle = \sum _I \sqrt{\lambda ^{(j,k)}_I}. \end{aligned}$$

By the same argument as in Theorem 4.12 we find a state

$$\begin{aligned} \phi ^{(j,k,\varepsilon )} = \sum _I \sqrt{\lambda ^{(j,k,\varepsilon )}_I} \end{aligned}$$

which is such that for any $\Delta _B \in C(B)$ not equal to B or $V_b' \cup B$ and $\Delta _A = \Gamma _{A,1} \cup \Delta _B$

$$\begin{aligned} H_{\min }(\Delta _B {\setminus } B \vert B)_{\phi ^{(j,k,\varepsilon )}} \ge H^\varepsilon _{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi }, \end{aligned}$$

while for $\Delta _B = B$ and $\Delta _B = B \cup V_b'$ we have

$$\begin{aligned} H_{\min }(B)_{\phi ^{(j,k,\varepsilon )}}&\ge H_{\min }(B)_{\phi ^{(j,k)}} = \log \frac{1}{q_{2,k} p_{1,j}}\\ H_{\min }({{\bar{B}}})_{\phi ^{(j,k,\varepsilon )}}&\ge H_{\min }({{\bar{B}}})_{\phi ^{(j,k)}} = \log \frac{1}{q_{1,j}p_{2,k}}. \end{aligned}$$

Moreover, $\phi ^{(j,k,\varepsilon )}$ is close to $\phi ^{(j,k)}$ in the sense that

$$\begin{aligned} \Vert \phi ^{(j,k,\varepsilon )} - \phi ^{(j,k)}\Vert _1&= {\mathcal {O}}\left(\sqrt{\sum _I \left|\lambda ^{(j,k,\varepsilon )} - \lambda ^{(j,k)}\right|}\right) = {\mathcal {O}}(\sqrt{\varepsilon }). \end{aligned}$$

By the remark after Lemma 4.11 we may assume that $\lambda ^{(j,k,\varepsilon )} \le \lambda ^{(j,k)}$.

By a chain rule (e.g., Theorem 5.13 in [74], proven in [30]) for any $\Delta _B \in C(B)$ it holds that

$$\begin{aligned} H_2(\Delta _B)_{\phi ^{(j,k,\varepsilon )}} \ge H_{\min }(\Delta _B {\setminus } B \vert B)_{\phi ^{(j,k,\varepsilon )}} + H_{\min }(B)_{\phi ^{(j,k,\varepsilon )}}. \end{aligned}$$

We now let

$$\begin{aligned} \vert \phi ^{\varepsilon }\rangle = \sum _{j,k} \vert \phi ^{(j,k,\varepsilon )}\rangle . \end{aligned}$$

Then, by Lemma C.4

$$\begin{aligned} \Vert {{\tilde{\phi }}} - \phi ^{\varepsilon }\Vert _1&\le \sqrt{2 \sum _{j,k} \sum _I \left|\lambda _I^{(j,k)} - \lambda _I^{(j,k,\varepsilon )}\right|} = {\mathcal {O}}(\sqrt{n^{2\alpha }\log (n)^2\varepsilon (n)}) \nonumber \\&= {\mathcal {O}}(n^{\alpha - \frac{1}{2}\gamma }\log (n)) = {\mathcal {O}}(n^{-3}). \end{aligned}$$

(4.48)

Therefore, if we let $\sigma ^{\varepsilon }_B$ be the random tensor network state with background state $\phi ^{\varepsilon }$

$$\begin{aligned} \mathbbm {E}\Vert \sigma - \sigma ^{\varepsilon }\Vert _1 \le \Vert \phi - {{\tilde{\phi }}}\Vert _1 + \Vert {{\tilde{\phi }}} - \phi ^{\varepsilon }\Vert _1 = o\bigg (\frac{1}{n}\bigg ). \end{aligned}$$

(4.49)

Finally, let

$$\begin{aligned} \vert \phi ^{(j,k,\varepsilon )}\rangle = \sqrt{q_{1,j} q_{2,k}}\vert \psi ^{(j,k,\varepsilon )}\rangle \end{aligned}$$

By Eq. (C.2) in the remark after the proof of Lemma 4.11 we have

$$\begin{aligned} \left|{{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}] - 1\right|&= \frac{1}{q_{1,j}q_{2,k}}\left|{{\,\textrm{tr}\,}}[\phi ^{(j,k,\varepsilon )}] - {{\,\textrm{tr}\,}}[\phi ^{(j,k)}]\right| \\&= {\mathcal {O}}\left( \frac{\varepsilon }{q_{1,j}q_{2,k}}\right) = {\mathcal {O}}(n^{2\beta - \gamma }) = {\mathcal {O}}(n^{-1}). \end{aligned}$$

using that $q_{1,j}, q_{2,k} > n^{-\beta }$ and $\beta \le \frac{1}{2}(\gamma - 1)$. We denote the random tensor network states with background states $\psi ^{(j,k,\varepsilon )}$ by $\sigma ^{(j,k)}$, and the random tensor network states with background states $\sum _{j} \sqrt{q_{1,j}}\psi ^{(j,k,\varepsilon )}$ by $\sigma ^{(k)}$.

We introduce the event N, which entails that $\rho , \sigma $ and $\sigma ^{(j,k)}$ for all j, k are close to normalized, that is

$$\begin{aligned} \left|{{\,\textrm{tr}\,}}[\rho ] - 1\right| \le \frac{1}{n^2} \ \text { and } \ \left|{{\,\textrm{tr}\,}}[\sigma ] - 1\right| \le \frac{1}{n^2} \ \text { and } \ \left|{{\,\textrm{tr}\,}}[\sigma ^{(j,k)}] - {{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}]\right| \le \frac{1}{n^2}. \end{aligned}$$

By assumption, for each edge $e = (xy)$ we have ${{\,\textrm{tr}\,}}[\phi _{e,x}^2] \le 2^{-\Omega (n)}$. Moreover, using that $q_{1,j}, q_{2,k} \ge n^{-\beta }$

$$\begin{aligned} {{\,\textrm{tr}\,}}[(\phi ^{(j,k,\varepsilon )}_{\Delta })^2] \le q_{1,j}^{-2} q_{2,k}^{-2} {{\,\textrm{tr}\,}}[\phi ^2_{\Delta }] = {\mathcal {O}}({{\,\textrm{poly}\,}}(n)2^{-\Omega (n)}) \end{aligned}$$

and therefore the quantity $\eta $ in Lemma 2.1 is ${\mathcal {O}}(2^{-\Omega (n)})$ for $\rho $, $\sigma $ and $\sigma ^{(j,k)}$. So, by Lemma 2.1 and the union bound the event N has probability

$$\begin{aligned} p_N&\ge 1 - \left( \Pr \left( \left|{{\,\textrm{tr}\,}}[\rho ] - 1\right| \ge \frac{1}{n^2}\right) + \Pr \left( \left|{{\,\textrm{tr}\,}}[\sigma ] - 1\right| \ge \frac{1}{n^2}\right) \right. \\&\left. \quad + \sum _{j,k} \Pr \left( \left|{{\,\textrm{tr}\,}}[\sigma ^{(j,k)}] - {{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}]\right| \ge \frac{1}{n^2}\right) \right) \\&= 1 - {\mathcal {O}}({{\,\textrm{poly}\,}}(n)2^{-\Omega (n)}) = 1 - {\mathcal {O}}(2^{-\Omega (n)}). \end{aligned}$$

We denote by $\mathbbm {E}_N$ the expectation value over the random tensors conditioned on this event.

We now use Lemma 4.13 to approximate the entropy of $\sigma ^{\varepsilon }_B$ conditioned on N:

$$\begin{aligned} \left|H(\sigma ^{\varepsilon }_B) - \sum _k q_{2,k} H(\sigma _B^{(k)})\right| = {\mathcal {O}}(\log (n)). \end{aligned}$$

Because $\sigma ^{(k)}$ is pure, we have $H(\sigma _B^{(k)}) = H(\sigma _{{{\bar{B}}}}^{(k)})$. We apply Lemma 4.13 again, this time to the decomposition of $\sigma _{{{\bar{B}}}}^{(k)}$ to see:

$$\begin{aligned} \left|H(\sigma _{{{\bar{B}}}}^{(k)}) - \sum _j q_{1,j} H(\sigma _{{{\bar{B}}}}^{(j,k)})\right| = {\mathcal {O}}(\log (n)). \end{aligned}$$

Thus,

$$\begin{aligned} \left|H(\sigma ^{\varepsilon }_B) - \sum _{j,k} q_{1,j} q_{2,k} H(\sigma _{B}^{(j,k)})\right| = {\mathcal {O}}(\log (n)). \end{aligned}$$

Since ${{\,\textrm{rank}\,}}(\sigma _{B}^{(j,k)}) \le \min \{d_{1,j}, d_{2,k} \}$ (and taking into account the normalization of $\sigma ^{(j,k)}$)

$$\begin{aligned} H(\sigma _{B}^{(j,k)}) \le {{\,\textrm{tr}\,}}[\sigma ^{(j,k)}]\min \left( \log d_{1,j}, \log d_{2,k} \right) - {{\,\textrm{tr}\,}}[\sigma ^{(j,k)}] \log {{\,\textrm{tr}\,}}[\sigma ^{(j,k)}]. \end{aligned}$$

Conditioned on N and using $\left|{{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}] - 1\right| = {\mathcal {O}}(n^{-1})$, this implies

$$\begin{aligned} H(\sigma _{B}^{(j,k)}) \le \min \left( \log d_{1,j}, \log d_{2,k} \right) + {\mathcal {O}}(1). \end{aligned}$$

For a lower bound, we use that

$$\begin{aligned} \mathbbm {E}_N H(\sigma _{B}^{(j,k)})&\ge \mathbbm {E}_N {{\,\textrm{tr}\,}}[\sigma ^{(j,k)}]\left( -\log {{\,\textrm{tr}\,}}\left[(\sigma _{B}^{(j,k)})^2\right] + \log {{\,\textrm{tr}\,}}[\sigma _{B}^{(j,k)}]\right) \\&\ge \bigg ({{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}] - \frac{1}{n^2}\bigg )\left( -\log \mathbbm {E}_N {{\,\textrm{tr}\,}}\left[ (\sigma _{B}^{(j,k)})^2\right] + \log \bigg ({{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}] - \frac{1}{n^2}\bigg ) \right) \\&\ge -\log \mathbbm {E}_N {{\,\textrm{tr}\,}}\left[ (\sigma _{B}^{(j,k)})^2\right] - {\mathcal {O}}(1) \end{aligned}$$

where in the first and second inequality we have used Jensen’s inequality, and in the second and third inequality we have used ${{\,\textrm{tr}\,}}[\sigma _{B}^{(j,k)}] \ge {{\,\textrm{tr}\,}}[\psi ^{(j,k,\varepsilon )}] - \frac{1}{n^2} \ge 1 - {\mathcal {O}}(\frac{1}{n})$. We use the replica trick to estimate

$$\begin{aligned} \mathbbm {E}{{\,\textrm{tr}\,}}\left[ (\sigma ^{(j,k)}_B)^2 \right] = \sum _{\Delta _B \in C(B)} {{\,\textrm{tr}\,}}[(\psi ^{(j,k,\varepsilon )}_{\Delta _B})^2] \end{aligned}$$

In this expression, we have contributions from $\Delta _B = B$ and $\Delta _B = B \cup V_b'$, which yield contributions

$$\begin{aligned} {{\,\textrm{tr}\,}}[(\psi ^{(j,k,\varepsilon )}_{B})^2] \le d_{1,j}^{-1} \\ {{\,\textrm{tr}\,}}[(\psi ^{(j,k,\varepsilon )}_{{{\bar{B}}}})^2] \le d_{2,k}^{-1}. \end{aligned}$$

For any other cut $\Delta _B$, we have a contribution at most

$$\begin{aligned} {{\,\textrm{tr}\,}}[(\psi ^{(j,k,\varepsilon )}_{\Delta _B})^2]&= q_{1,j}^{-2}q_{2,k}^{-2} {{\,\textrm{tr}\,}}[\phi ^{(j,k,\varepsilon )}] 2^{-H_{\min }(B)_{\phi ^{(j,k,\varepsilon )}} - H^\varepsilon _{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi }}\\&\le q_{1,j}^{-1}q_{2,k}^{-1}2^{-H_{\min }(B)_{\phi ^{(j,k,\varepsilon )}}} \le d_{1,j}^{-1} \end{aligned}$$

using that ${{\,\textrm{tr}\,}}[\phi ^{(j,k,\varepsilon )}] \le q_{1,j}q_{2,k}$ and $2^{-H_{\min }(B)_{\phi ^{(j,k,\varepsilon )}}} \le q_{2,k}p_{1,j} = q_{1,j}q_{2,k} d_{1,j}^{-1}$. Therefore

$$\begin{aligned} \mathbbm {E}_N {{\,\textrm{tr}\,}}\left[ (\sigma ^{(j,k,\varepsilon )}_B)^2 \right]&\le p_N^{-1} \mathbbm {E}{{\,\textrm{tr}\,}}\left[ (\sigma ^{(j,k,\varepsilon )}_B)^2 \right] \\&= d_{1,j}^{-1} + d_{2,k}^{-1} + {\mathcal {O}}(d_{1,j}^{-1}). \end{aligned}$$

$$\begin{aligned} -\log \mathbbm {E}_N {{\,\textrm{tr}\,}}\left[ (\sigma ^{(j,k,\varepsilon )}_B)^2 \right]&\ge -\log \left( \max \left(d_{1,j}^{-1}, d_{2,k}^{-1}\right)(2 + {\mathcal {O}}(1)) \right) \\&\ge \min ( \log d_{1,j}, \log d_{2,k} ) - {\mathcal {O}}(1). \end{aligned}$$

We find that

$$\begin{aligned} \mathbbm {E}_N \left|H(\sigma _{B}^{(j,k)}) - \min ( \log d_{1,j}, \log d_{2,k} )\right| = {\mathcal {O}}(1) \end{aligned}$$

and hence

$$\begin{aligned} \mathbbm {E}_N \left|\sum _{j,k} q_{1,j} q_{2,k} H(\sigma _{B}^{(j,k)}) - \sum _{j,k} q_{1,j} q_{2,k} \min ( \log d_{1,j}, \log d_{2,k} )\right| = {\mathcal {O}}(1). \end{aligned}$$

(4.50)

We collect the various estimates we have found:

$$\begin{aligned}&\mathbbm {E}_N \Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1 \le p_N^{-1}\mathbbm {E}\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1 = o\left( \frac{1}{n}\right) \\&\mathbbm {E}_N \Vert \sigma _B - \sigma ^{\varepsilon }_B\Vert _1 \le p_N^{-1}\mathbbm {E}\Vert \sigma _B - \sigma ^{\varepsilon }_B\Vert _1 = o\left( \frac{1}{n}\right) \\&\mathbbm {E}_N \left|\sum _{j,k} q_{1,j} q_{2,k} H(\sigma _{B}^{(j,k)}) - \sum _{j,k} q_{1,j} q_{2,k} \min ( \log d_{1,j}, \log d_{2,k} )\right| = {\mathcal {O}}(1). \end{aligned}$$

Let M be the event that N holds and moreover

$$\begin{aligned}&\Vert {{\,\textrm{spec}\,}}_+(\rho _A) - {{\,\textrm{spec}\,}}_+(\sigma _{B})\Vert _1 \le \frac{1}{n}\\&\Vert \sigma _B - \sigma ^{\varepsilon }_B\Vert _1 \le \frac{1}{n} \\&\left|\sum _{j,k} q_{1,j} q_{2,k} H(\sigma _{B}^{(j,k)}) - \sum _{j,k} q_{1,j} q_{2,k} \min ( \log d_{1,j}, \log d_{2,k} )\right| \le \log (n). \end{aligned}$$

By Markov’s inequality and the union bound, the probability that M holds goes to one as n goes to infinity. Moreover, if M holds, it is easy to verify that by the Fannes–Audenaert inequality Eq. (4.44) and the fact that $\rho $, $\sigma $ and $\sigma ^{\varepsilon }$ are close to normalized

$$\begin{aligned} \left|H\Bigg (\frac{\rho _A}{{{\,\textrm{tr}\,}}[\rho ]}\Bigg ) - H\Bigg (\sigma ^{\varepsilon }_B\Bigg )\right|&= {\mathcal {O}}(1). \end{aligned}$$

Also, if M holds, by Eq. (4.50)

$$\begin{aligned}&\left|H(\sigma ^{\varepsilon }_B) - \sum _{j,k} q_{1,j} q_{2,k} \min ( \log d_{1,j}, \log d_{2,k} )\right|\\&\quad = \left|\sum _{j,k} q_{1,j} q_{2,k} H(\sigma _{B}^{(j,k)}) - \sum _{j,k} q_{1,j} q_{2,k} \min ( \log d_{1,j}, \log d_{2,k} )\right| \\&\qquad +{\mathcal {O}}(\log (n))= {\mathcal {O}}(\log (n)) \end{aligned}$$

so we conclude that

$$\begin{aligned} \left|H\bigg (\frac{\rho _A}{{{\,\textrm{tr}\,}}[\rho ]}\bigg ) - \sum _{j,k} q_{1,j} q_{2,k} \min ( \log d_{1,j}, \log d_{2,k} )\right|&= {\mathcal {O}}(\log (n)). \end{aligned}$$

Since $q_{i,j} = d_{i,j}p_{i,j} \ge n^{-\beta }$ from the binning procedure, we have the simple observation

$$\begin{aligned} \frac{1}{n^{\beta } p_{i,j}} \le \frac{q_{i,j}}{p_{i,j}} = d_{i,j} \le \frac{1}{p_{i,j}}, \end{aligned}$$

and hence $\left|\min (\log d_{1,j},\log d_{2,k}) - \log (\min (\frac{1}{p_{1,k}},\frac{1}{p_{2,j}}))\right| = {\mathcal {O}}(\log (n))$. We conclude that

$$\begin{aligned} \left|H\left( \frac{\rho _A}{{{\,\textrm{tr}\,}}[\rho ]}\right) - \sum _{j,k} q_{1,k}q_{2,j}\log \left( \min \left(\frac{1}{p_{1,k}},\frac{1}{p_{2,j}}\right)\right) \right| = {\mathcal {O}}(\log (n)). \end{aligned}$$

Finally, we need to relate the result back to the original background state. In the above approximation to $H({\tilde{\sigma }}_B)$, we see that

$$\begin{aligned} \sum _{j,k} q_{1,k}q_{2,j}\log \left( \min \left(\frac{1}{p_{1,k}},\frac{1}{p_{2,j}}\right)\right) = H^*({{\,\textrm{spec}\,}}({{\tilde{\phi }}}_{\gamma _{A,1}}),{{\,\textrm{spec}\,}}({{\tilde{\phi }}}_{\gamma _{A,2}})). \end{aligned}$$

Then by Eqs. (4.47) and (4.45), this will converge to the appropriate quantity on the non-approximated background state:

$$\begin{aligned} \left|H^*({{\,\textrm{spec}\,}}({{\tilde{\phi }}}_{\gamma _{A,1}}),{{\,\textrm{spec}\,}}({{\tilde{\phi }}}_{\gamma _{A,2}})) - H^*({{\,\textrm{spec}\,}}(\phi _{\gamma _{A,1}}),{{\,\textrm{spec}\,}}(\phi _{\gamma _{A,2}}))\right| \rightarrow 0, \end{aligned}$$

proving the desired result. $\square $

Notes

Here, we are requiring any putative theory of quantum gravity to (a) be defined non-perturbatively and (b) have strong evidence for the existence of a semiclassical limit consisting of Einstein gravity coupled to quantum field theory.
A basic example of a distribution which does not have compact support, but is nevertheless uniquely determined by its moments, is a standard Gaussian distribution. On the other hand, a standard example of distributions that are not determined by their moments is the densities on $\mathbbm {R}_{\ge 0}$ with $\textrm{d}\mu _1(x) = \sqrt{2\pi }x^{-1}e^{-(\log x)^2/2} \textrm{d}x$ and $\textrm{d}\mu _2(x) = (1+\sin (2\pi \log x)) \textrm{d}\mu _1(x)$, for which it can be verified that the nth moments of both distributions are equal to $e^{-n^2/2}$, while the distributions are clearly not identical.
The function $f(\lambda ) = \sqrt{\lambda }$ is not in $C_b(\mathbbm {R})$, but the method of moments actually shows a stronger convergence, allowing test functions to have polynomial growth.
Note that while $H(A\vert B)_\phi = H(AB)_\phi - H(B)_\phi $, in general $H_2(A\vert B)_{\phi \vert \phi } \ne H_2(AB)_{\phi } - H_2(B)_{\phi }$.
In fact, the protocol in [26] is slightly more general than what we describe; rather than measuring a random state one could also measure a random projection of rank greater than 1. This can be used to obtain EPR pairs between the helpers and Alice and Bob to get nonzero $L_{A,i}$ and $L_{B,i}$.

References

Almheiri, A., Engelhardt, N., Marolf, D., Maxfield, H.: The entropy of bulk quantum fields and the entanglement wedge of an evaporating black hole. J. High Energy Phys. 2019(12), 1–47 (2019)
MathSciNet Google Scholar
Akers, C., Faulkner, T., Lin, S., Rath, P.: Reflected entropy in random tensor networks. arXiv preprint arXiv:2112.09122 (2021)
Anderson, G.W., Guionnet, A., Zeitouni, O.: An Introduction to Random Matrices. Cambridge University Press, Cambridge (2010)
Google Scholar
Almheiri, A., Hartman, T., Maldacena, J., Shaghoulian, E., Tajdini, A.: Replica wormholes and the entropy of Hawking radiation. J. High Energy Phys. 2020(5), 1–42 (2020)
MathSciNet Google Scholar
Aubrun, G., Nechita, I.: Realigning random states. J. Math. Phys. 53(10), 102210 (2012)
ADS MathSciNet Google Scholar
Akers, C., Penington, G.: Leading order corrections to the quantum extremal surface prescription. arXiv preprint arXiv:2008.03319 (2020)
Akers, C., Penington, G.: Quantum minimal surfaces from quantum error correction (2022)
Akers, C., Rath, P.: Holographic Rényi entropy from quantum error correction. J. High Energy Phys. 2019(5), 1–24 (2019)
Google Scholar
Aubrun, G., Szarek, S.J.: Alice and Bob Meet Banach, vol. 223. American Mathematical Society, Providence (2017)
Google Scholar
Aubrun, G., Szarek, S.J., Ye, D.: Phase transitions for random states and a semicircle law for the partial transpose. Phys. Rev. A 85(3), 030302 (2012)
ADS Google Scholar
Aubrun, G.: Partial transposition of random states and non-centered semicircular distributions. Random Matrices Theory Appl. 1(02), 1250001 (2012)
MathSciNet Google Scholar
Banica, T., Belinschi, S.T., Capitaine, M., Collins, B.: Free Bessel laws. Can. J. Math. 63(1), 3–37 (2011)
MathSciNet Google Scholar
Billingsley, P.: Probability and Measure. Wiley, New York (2008)
Google Scholar
Bao, N., Penington, G., Sorce, J., Wall, A.C.: Beyond toy models: distilling tensor networks in full AdS/CFT. J. High Energy Phys. 2019(11), 1–63 (2019)
MathSciNet Google Scholar
Bai, Z., Silverstein, J.W.: Spectral Analysis of Large Dimensional Random Matrices, vol. 20. Springer, New York (2010)
Google Scholar
Bousso, R., Wildenhain, E.: Gravity/ensemble duality. Phys. Rev. D 102(6), 066005 (2020)
ADS MathSciNet Google Scholar
Christandl, M., Doran, B., Kousidis, S., Walter, M.: Eigenvalue distributions of reduced density matrices. Commun. Math. Phys. 332(1), 1–52 (2014)
ADS MathSciNet Google Scholar
Collins, B., González-Guillén, C.E., Pérez-García, D.: Matrix product states, random matrix theory and the principle of maximum entropy. Commun. Math. Phys. 320(3), 663–677 (2013)
ADS MathSciNet Google Scholar
Czech, B., Hayden, P., Lashkari, N., Swingle, B.: The information theoretic interpretation of the length of a curve. J. High Energy Phys. 2015(6), 1–40 (2015)
MathSciNet Google Scholar
Collins, B., Nechita, I.: Random matrix techniques in quantum information theory. J. Math. Phys. 57(1), 015215 (2016)
ADS MathSciNet Google Scholar
Collins, B., Nechita, I., Życzkowski, K.: Random graph states, maximal flow and Fuss–Catalan distributions. J. Phys. A Math. Theor. 43(27), 275–303 (2010)
MathSciNet Google Scholar
Collins, B., Nechita, I., Życzkowski, K.: Area law for random graph states. J. Phys. A Math. Theor. 46(30), 305302 (2013)
ADS MathSciNet Google Scholar
Cirac, J.I., Perez-Garcia, D., Schuch, N., Verstraete, F.: Matrix product states and projected entangled pair states: concepts, symmetries, theorems. Rev. Mod. Phys. 93(4), 045003 (2021)
ADS MathSciNet Google Scholar
Dupuis, F., Berta, M., Wullschleger, J., Renner, R.: One-shot decoupling. Commun. Math. Phys. 328(1), 251–284 (2014)
ADS MathSciNet Google Scholar
Drescher, L., Fawzi, O.: On simultaneous min-entropy smoothing. In: 2013 IEEE International Symposium on Information Theory, pp. 161–165. IEEE (2013)
Dutil, N., Hayden, P.: One-shot multiparty state merging. arXiv preprint arXiv:1011.1974 (2010)
Dong, X., Harlow, D., Marolf, D.: Flat entanglement spectra in fixed-area states of quantum gravity. J. High Energy Phys. 2019(10), 1–25 (2019)
MathSciNet Google Scholar
Dong, X., McBride, S., Weng, W.W.: Replica wormholes and holographic entanglement negativity. arXiv preprint arXiv:2110.11947 (2021)
Dong, X., Qi, X.-L., Walter, M.: Holographic entanglement negativity and replica symmetry breaking. arXiv preprint arXiv:2101.11029 (2021)
Dupuis, F.: Chain rules for quantum Rényi entropies. J. Math. Phys. 56(2), 022203 (2015)
ADS MathSciNet Google Scholar
Dutil, N.: Multiparty quantum protocols for assisted entanglement distillation. arXiv preprint arXiv:1105.4657 (2011)
Engelhardt, N., Wall, A.C.: Quantum extremal surfaces: holographic entanglement entropy beyond the classical regime. J. High Energy Phys. 2015(1), 1–27 (2015)
Google Scholar
Faulkner, T., Lewkowycz, A., Maldacena, J.: Quantum corrections to holographic entanglement entropy. J. High Energy Phys. 2013(11), 1–18 (2013)
Google Scholar
Fawzi, O., Renner, R.: Quantum conditional mutual information and approximate Markov chains. Commun. Math. Phys. 340(2), 575–611 (2015)
ADS MathSciNet Google Scholar
Gross, D., Audenaert, K., Eisert, J.: Evenly distributed unitaries: on the structure of unitary designs. J. Math. Phys. 48(5), 052104 (2007)
ADS MathSciNet Google Scholar
Harlow, D.: The Ryu–Takayanagi formula from quantum error correction. Commun. Math. Phys. 354(3), 865–912 (2017)
ADS MathSciNet Google Scholar
Hastings, M.B.: The asymptotics of quantum max-flow min-cut. Commun. Math. Phys. 351(1), 387–418 (2017)
ADS MathSciNet Google Scholar
Hayashi, M.: Second-order asymptotics in fixed-length source coding and intrinsic randomness. IEEE Trans. Inf. Theory 54(10), 4619–4637 (2008)
MathSciNet Google Scholar
Hayden, P., Leung, D.W., Winter, A.: Aspects of generic entanglement. Commun. Math. Phys. 265(1), 95–117 (2006)
ADS MathSciNet Google Scholar
Hayden, P., Nezami, S., Qi, X.-L., Thomas, N., Walter, M., Yang, Z.: Holographic duality from random tensor networks. J. High Energy Phys. 2016(11), 1–56 (2016)
MathSciNet Google Scholar
Hubeny, V.E., Rangamani, M., Takayanagi, T.: A covariant holographic entanglement entropy proposal. J. High Energy Phys. 2007(07), 062 (2007)
ADS MathSciNet Google Scholar
Kudler-Flam, J., Narovlansky, V., Ryu, S.: Negativity spectra in random tensor networks and holography. arXiv preprint arXiv:2109.02649 (2021)
Klappenecker, A., Rotteler, M.: Mutually unbiased bases are complex projective 2-designs. In: Proceedings of the International Symposium on Information Theory, 2005. ISIT 2005, pp. 1740–1744. IEEE (2005)
Levy, R., Clark, B.K.: Entanglement entropy transitions with random tensor networks. arXiv preprint arXiv:2108.02225 (2021)
Lewkowycz, A., Maldacena, J.: Generalized gravitational entropy. J. High Energy Phys. 2013(8), 1–29 (2013)
MathSciNet Google Scholar
Lancien, C., Pérez-García, D.: Correlation length in random MPS and PEPS. Ann. Henri Poincaré, 1–82 (2021)
Lopez-Piqueres, J., Ware, B., Vasseur, R.: Mean-field entanglement transitions in random tree tensor networks. Phys. Rev. B 102(6), 064202 (2020)
ADS Google Scholar
Li, Y., Vasseur, R., Fisher, M., Ludwig, A.W.W.: Statistical mechanics model for Clifford random tensor networks and monitored quantum circuits. arXiv preprint arXiv:2110.02988 (2021)
Maldacena, J.: The large-$N$ limit of superconformal field theories and supergravity. Int. J. Theor. Phys. 38(4), 1113–1133 (1999)
MathSciNet Google Scholar
Morgan, E., Brandão, F.G.S.L.: A classical model correspondence for $G$-symmetric random tensor networks. J. Phys. Commun. 5, 105016 (2021)
Google Scholar
Marolf, D., Maxfield, H.: Transcending the ensemble: baby universes, spacetime wormholes, and the order and disorder of black hole information. J. High Energy Phys. 2020(8), 1–72 (2020)
MathSciNet Google Scholar
Mingo, J.A., Speicher, R.: Free Probability and Random Matrices, vol. 35. Springer, New York (2017)
Google Scholar
Medina, R., Vasseur, R., Serbyn, M.: Entanglement transitions from restricted Boltzmann machines. Phys. Rev. B 104(10), 104205 (2021)
ADS Google Scholar
Marolf, D., Wang, S., Wang, Z.: Probing phase transitions of holographic entanglement entropy with fixed area states. J. High Energy Phys. 2020(12), 1–41 (2020)
MathSciNet Google Scholar
Nahum, A., Roy, S., Skinner, B., Ruhman, J.: Measurement and entanglement phase transitions in all-to-all quantum circuits, on quantum trees, and in Landau-Ginsburg theory. PRX Quantum 2(1), 010352 (2021)
Google Scholar
Nica, A., Speicher, R.: Lectures on the Combinatorics of Free Probability, vol. 13. Cambridge University Press, Cambridge (2006)
Google Scholar
Nezami, S., Walter, M.: Multipartite entanglement in stabilizer tensor networks. Phys. Rev. Lett. 125(24), 241602 (2020)
ADS MathSciNet Google Scholar
Potters, M., Bouchaud, J.-P.: A First Course in Random Matrix Theory: For Physicists, Engineers and Data Scientists. Cambridge University Press, Cambridge (2020)
Google Scholar
Penington, G.: Entanglement wedge reconstruction and the information paradox. J. High Energy Phys. 2020(9), 1–84 (2020)
MathSciNet Google Scholar
Penington, G., Shenker, S.H., Stanford, D., Yang, Z.: Replica wormholes and the black hole interior. arXiv preprint arXiv:1911.11977 (2019)
Pastawski, F., Yoshida, B., Harlow, D., Preskill, J.: Holographic quantum error-correcting codes: Toy models for the bulk/boundary correspondence. J. High Energy Phys. 2015(6), 1–55 (2015)
MathSciNet Google Scholar
Qi, X.-L., Shangnan, Z., Yang, Z.: Holevo information and ensemble theory of gravity. arXiv preprint arXiv:2111.05355 (2021)
Qi, X.-L., Yang, Z.: Space-time random tensor networks and holographic duality. arXiv preprint arXiv:1801.05289 (2018)
Qi, X.-L., Yang, Z., You, Y.-Z.: Holographic coherent states from random tensor networks. J. High Energy Phys. 2017(8), 1–29 (2017)
MathSciNet Google Scholar
Ryu, S., Takayanagi, T.: Aspects of holographic entanglement entropy. J. High Energy Phys. 2006(08), 045 (2006)
MathSciNet Google Scholar
Ryu, S., Takayanagi, T.: Holographic derivation of entanglement entropy from the anti-de Sitter space/conformal field theory correspondence. Phys. Rev. Lett. 96(18), 181602 (2006)
ADS MathSciNet Google Scholar
Sárosi, G.: $\text{AdS}_2$ holography and the SYK model. arXiv preprint arXiv:1711.08482 (2017)
Saad, P., Shenker, S.H., Stanford, D.: JT gravity as a matrix integral. arXiv preprint arXiv:1903.11115 (2019)
Saad, P., Shenker, S.H., Stanford, D., Yao, S.: Wormholes without averaging. arXiv preprint arXiv:2103.16754 (2021)
Swingle, B.: Constructing holographic spacetimes using entanglement renormalization. arXiv preprint arXiv:1209.3304 (2012)
Swingle, B.: Entanglement renormalization and holography. Phys. Rev. D 86(6), 065007 (2012)
ADS Google Scholar
Tomamichel, M., Hayashi, M.: A hierarchy of information quantities for finite block length analysis of quantum tasks. IEEE Trans. Inf. Theory 59(11), 7693–7710 (2013)
MathSciNet Google Scholar
Tomamichel, M.: A framework for non-asymptotic quantum information theory. Ph.D. thesis, ETH Zurich (2012)
Tomamichel, M.: Quantum Information Processing with Finite Resources: Mathematical Foundations, vol. 5. Springer, New York (2015)
Google Scholar
Verstraete, F., Cirac, J.I.: Valence-bond states for quantum computation. Phys. Rev. A 70(6), 060302 (2004)
ADS Google Scholar
Verstraete, F., Murg, V., Cirac, J.I.: Matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. Adv. Phys. 57(2), 143–224 (2008)
ADS Google Scholar
Vasseur, R., Potter, A.C., You, Y.-Z., Ludwig, A.W.W.: Entanglement transitions from holographic random tensor networks. Phys. Rev. B 100(13), 134203 (2019)
ADS Google Scholar
Wang, J.: On the information-theoretic aspects of black holes. Ph.D. thesis, ETH Zurich (2022)
Walter, M., Witteveen, F.: Hypergraph min-cuts from quantum entropies. J. Math. Phys. 62(9), 092203 (2021)
ADS MathSciNet Google Scholar
Yang, Z., Hayden, P., Qi, X.-L.: Bidirectional holographic codes and sub-AdS locality. J. High Energy Phys. 2016(1), 1–24 (2016)
MathSciNet Google Scholar
Yang, Z.-C., Li, Y., Fisher, M., Chen, X.: Entanglement phase transitions in random stabilizer tensor networks. arXiv preprint arXiv:2107.12376 (2021)
You, Y.-Z., Yang, Z., Qi, X.-L.: Machine learning spatial geometry from entanglement features. Phys. Rev. B 97(4), 045153 (2018)
ADS Google Scholar

Download references

Acknowledgements

NC is supported in part by the Department of Energy via the GeoFlow consortium (QuantISED Award DE-SC0019380). CL acknowledges support from the projects ESQuisses (ANR-20-CE47-0014-01), STARS (ANR-20-CE40-0008), Qtraj (ANR-20-CE40-0024-01), and Random Tensors (ANR-11-LABX-0040) of the French National Research Agency (ANR). GP is supported by the UC Berkeley Physics Department, the Simons Foundation through the “It from Qubit” program, the Department of Energy via the GeoFlow consortium (QuantISED Award DE-SC0019380), and AFOSR award FA9550-22-1-0098. He also acknowledges support from an IBM Einstein Fellowship at the Institute for Advanced Study. MW acknowledges support by the NWO through Grant OCENW.KLEIN.267, by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2092 CASA - 390781972, by the BMBF through project Quantum Methods and Benchmarks for Resource Allocation (QuBRA), and by European Research Council (ERC), funded by the European Union. Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.

Funding

Open access funding provided by Royal Library, Copenhagen University Library.

Author information

Authors and Affiliations

Department of Physics, Center for Theoretical Physics, University of California, Berkeley, USA
Newton Cheng & Geoff Penington
Institut Fourier and CNRS, Université Grenoble Alpes, Gières, France
Cécilia Lancien
Institute for Advanced Study, Princeton, USA
Geoff Penington
Faculty of Computer Science, Ruhr University Bochum, Bochum, Germany
Michael Walter
Department of Mathematical Sciences and QMATH, University of Copenhagen, Copenhagen, Denmark
Freek Witteveen

Authors

Newton Cheng
View author publications
You can also search for this author in PubMed Google Scholar
Cécilia Lancien
View author publications
You can also search for this author in PubMed Google Scholar
Geoff Penington
View author publications
You can also search for this author in PubMed Google Scholar
Michael Walter
View author publications
You can also search for this author in PubMed Google Scholar
Freek Witteveen
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Freek Witteveen.

Additional information

Communicated by David Pérez-García.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Euclidean Gravity Path Integrals and Entanglement Spectra

In this appendix, we give a heuristic description of certain Euclidean gravity path integrals in holography whose descriptions are in close analogy with random tensor network models. This section serves as a motivation for the random tensor network models we study, but is not needed to understand the random tensor network results. In “Appendix A.1,” we review the replica trick in quantum field theory and the role of Euclidean path integrals. We review an application of such tools in “Appendix A.2,” and we discuss the problem of studying the entanglement entropy near a phase transition between two minimal surfaces, which agrees with results in Sect. 4. Then, in “Appendix A.3,” we discuss a simplified model of quantum gravity which is in very close correspondence to a random tensor network model with link states with bounded spectral variation, as in Sect. 3.

1.1 The Replica Trick and Euclidean Path Integrals

In order to compute entropies in quantum field theory, one often uses a version of the replica trick to compute the Rényi entropies, which can then be analytically continued to deduce von Neumann entropies. We consider a pure quantum field theory state $\vert \rho \rangle $ on a space M, which is prepared by a Euclidean path integral on $M \times (-\infty ,0]$. Correspondingly, $\langle \rho \vert $ is prepared by the time-reflected path integral on $M \times [0,\infty )$. Let A be a subregion of M. The reduced density matrix on A is given by taking $\vert \rho \rangle $ and $\langle \rho \vert $, then integrating over the field configurations on the complement of A (the equivalent of the partial trace for field theories), as shown in Fig. 6a. Analogous to Eq. (2.8), we may now compute ${{\,\textrm{tr}\,}}[\rho _A^k]$ by taking k copies of this path integral, and gluing the boundaries at the A system cyclically, then integrating over the field configurations at each boundary. This operation is manifestly invariant under cyclic permutations, a symmetry called replica symmetry. This is illustrated in Fig. 6b. We conclude that ${{\,\textrm{tr}\,}}[\rho _A^k]$ is computed by a path integral $Z_{A,k}$ on a manifold $M_{A,k}$, allowing us to compute $H_k(\rho _A)$. The space $M_{A,k}$ is a k-fold cover of $M \times \mathbbm {R}$, branching at the boundary $\partial A$ of the subregion A.

Of course, the path integral $Z_{A,k}$ is formally infinite. One “normalizes” the path integral by normalizing by $Z_1$ which can be thought of as ${{\,\textrm{tr}\,}}[\rho ]$. We have the expression for the kth Rényi entropy:

$$\begin{aligned} H_k(\rho _A) = \log \frac{Z_{A,k}}{Z_1^k}. \end{aligned}$$

(A.1)

Note that this is a slightly different normalization convention (the denominator is $Z_1^k$ instead of $Z_1$) to match the standard convention in the quantum gravity literature. To get a finite result, one has to impose a UV cutoff of size $\varepsilon $, an aspect we will ignore in our discussion. By employing such a cutoff, we will pretend that the relevant Hilbert spaces are finite-dimensional Hilbert spaces that factorize with respect to spatial decompositions of M.

What happens if the quantum field theory is a holographic CFT? In this case, we have a correspondence between the path integral of the CFT on the one hand, and a bulk quantum gravity path integral on the other hand, which, for large effective central charge, we may approximate by its semiclassical saddle point geometry. This can be used to derive the Ryu–Takayanagi formula [45], by finding a bulk manifold $B_{A,k}$ which has $M_{A,k}$ as its boundary. In other words, one obtains the gravitational dual by setting $M_{A,k}$ as a boundary condition, then allowing the system to evolve according to the gravitational theory.

The saddle point solution will be such that the bulk copies are glued along a cyclic permutation in an area adjacent to A and along the identity permutation in an area adjacent to ${\bar{A}}$, the boundary complement of A. The boundary between these two regions is a surface $\gamma _{A,k}$ anchored at the boundary at $\partial A$. After orbifolding with respect to the replica symmetry, this leads to a bulk spacetime which has M as its boundary and a “cosmic brane” at the surface $\gamma _{A,k}$. The orbifold procedure gives rise to a conical deficit, corresponding to the fixed points under the action of the replica symmetry. Comparing the action of $B_{A,k}$ and $B_1$, we find that in the computation of Eq. (A.1), only the conical deficit at $\gamma _{A,k}$ does not cancel, yielding

$$\begin{aligned} H_k(\rho _A) \approx \frac{\left|\gamma _{A,k}\right|}{4G_N}. \end{aligned}$$

(A.2)

where $\left|\gamma _{A,k}\right|$ is the area of the brane in the saddle point solution. This can be analytically continued to non-integer values, and in particular, continuation to $k = 1$ yields the RT formula:

$$\begin{aligned} H(\rho _A) \approx \frac{\left|\gamma _{A}\right|}{4G_N}. \end{aligned}$$

(A.3)

where $\gamma _A$ is the surface of minimal area out of all bulk surface homologous to A and anchored at $\partial A$.

The spectrum of $\rho _A$ can be recovered from the Rényi entropies. In Eq. (A.2), we see that $\left|\gamma _{A,k}\right|$ depends on k, but is otherwise fixed as we let $G_N$ go to zero in the classical limit. It was argued in [14] that this behavior implies a spectrum that is flat at leading order:

$$\begin{aligned} H_{\min }^{\varepsilon }(\rho _A)&= H(\rho _A) - {\mathcal {O}}(G_N^{-\frac{1}{2}}) = H(\rho _A) - {\mathcal {O}}(\sqrt{H(\rho _A)})\\ H_{\max }^{\varepsilon }(\rho _A)&= H(\rho _A) + {\mathcal {O}}(G_N^{-\frac{1}{2}}) = H(\rho _A) + {\mathcal {O}}(\sqrt{H(\rho _A)}) \end{aligned}$$

This is precisely the link state regime we investigated in Sect. 4.4, and shows that the large c limit in a holographic CFT is similar to considering the many-copy limit in quantum information theory.

1.1.1 Fixed-Area States

A useful variation on the derivation of the RT formula is to consider fixed-area states [27]. We consider the “area operator” ${\hat{\gamma }}_A$ for a subsystem A, which measures the area of a minimal surface. The operator ${\hat{\gamma }}_A$ actually has fluctuations when we consider a state with a semiclassical gravity dual, and we may write

$$\begin{aligned} \vert \rho \rangle = \int \textrm{d}\alpha \vert \psi _{\alpha }\rangle \end{aligned}$$

where $\psi _{\alpha }$ is an eigenvector of ${{\hat{A}}}$ with eigenvalue $\alpha $. The state $\vert \psi _\alpha \rangle $ is a fixed-area state: While it is not itself a physical state, they form a basis with which to construct physical states. These fixed-area states can be thought of as prepared by a bulk path integral where we have restricted to bulk geometries for which $\left|\gamma _A\right| = \alpha $. The same derivation as above now leads to

$$\begin{aligned} H_k(\rho _A) \approx \frac{\alpha }{4G_N}, \end{aligned}$$

since the area of the minimal surface is fixed to be $\alpha $. Thus, for fixed-area states, all Rényi entropies are (to good approximation) equal, which implies that the state has flat entanglement spectrum. This corresponds to a random tensor network state with maximally entangled link states.

1.2 States at a Minimal Surface Phase Transition

Consider, again, a holographic CFT state $\rho _A$ on a boundary subregion A. In the case where there is a unique minimal surface, the previous subsection showed how the replica trick leads to the RT formula.

We would now like to investigate the entanglement spectrum of $\rho _A$ when there are two RT surfaces for A that have area of the same order of magnitude. Let us denote the two competing minimal surfaces in the bulk by $\gamma _{A,1}$ and $\gamma _{A,2}$. In [54], it was shown how the entanglement entropy should behave at this phase transition between the two minimal surfaces. A similar computation was performed in [6] for the setting with two competing minimal surfaces and bulk matter. We will briefly sketch their argument, referring the interested reader to [6] for more details.

1.2.1 Fixed-Area States with Two Minimal Surfaces

We begin by considering bipartite fixed-area states $\rho _{A{\bar{A}}}$ prepared by a Euclidean path integral, in which we have fixed the size of the two competing surfaces $\gamma _{A,1}$ and $\gamma _{A,2}$. In this case, saddle points of the path integral have to satisfy the equations of motion everywhere except at the surfaces $\gamma _{A,i}$, where there could be conical singularities. The two surfaces divide the bulk into 3 regions: $a_1$, $a_2$, and $a_3$.

The saddle points are states with smooth geometries in the regions where the copies of regions $a_i$ are glued to each other—the k copies of region $a_1$ are glued cyclically, while the copies of region $a_3$ are glued along the identity permutation. On the middle region $a_2$, we are free to glue along an arbitrary permutation $\pi $, giving solutions that can break the replica symmetry. However, it is often the case that the dominant solutions of the path integral are those that respect the replica symmetry, and we assume this to be the case for all solutions we consider.

Let us denote a saddle point solution of the path integral with permutation $\pi $ by $B_{A,\pi }$, and let us write $\phi _i$ for the conical singularity angle at $\gamma _{A,i}$. It turns out that these saddle points lead to an action of the form

$$\begin{aligned} I(B_{A,\pi })&= k I_{\text {away}}(B_{A,\pi }) + (k\phi _1 - 2\pi \left|C(\pi )\right|)\frac{\left|\gamma _{A,1}\right|}{8\pi G_N} \\ {}&\quad + (k\phi _2 - 2\pi \left|C(\tau ^{-1}\pi )\right|)\frac{\left|\gamma _{A,2}\right|}{8\pi G_N}, \end{aligned}$$

where $I_{\text {away}}$ is the action away from the surfaces, $\left|\gamma _{A,i}\right|$ is the area of the surface $\gamma _{A,i}$, and we recall that $\left|C(\pi )\right|$ is the number of cycles of $\pi $, and $\tau $ is the full cycle $(1 2 \ldots k)$. In particular, for $k=1$, we have

$$\begin{aligned} I(B_{A,\pi }) = I_{\text {away}}(B_{A,\pi }) + (\phi _2 - 2\pi )\frac{\left|\gamma _{A,1}\right|}{8\pi G_N} + (\phi _2 - 2\pi )\frac{\left|\gamma _{A,2}\right|}{8\pi G_N}, \end{aligned}$$

so when we look at the normalized path integral and sum over all permutations

$$\begin{aligned} \frac{Z_{A,k}}{(Z_{A,1})^k}&\approx \sum _{\pi \in S_k} e^{(\left|C(\pi )\right| - k)\frac{\left|\gamma _{A,1}\right|}{4G_N} + (\left|C(\tau ^{-1}\pi )\right| - k)\frac{\left|\gamma _{A,2}\right|}{4G_N}} \\ {}&= \sum _{\pi \in S_k} e^{d(\textrm{id},\pi )\frac{\left|\gamma _{A,1}\right|}{4G_N} + d(\pi ,\tau )\frac{\left|\gamma _{A,2}\right|}{4G_N}}. \end{aligned}$$

Note that the areas $\gamma _{A,i}$ are of the same order of magnitude, and are divergent. As a result, only the permutations for which $d(\textrm{id},\pi ) + d(\pi ,\tau )$ is minimal will contribute. In other words, only the configurations where $\pi \in NC(k)$ contribute, as all other permutations are suppressed by at least a factor of the area of $\gamma _{A,i}$ in the action. We conclude that

$$\begin{aligned} \frac{Z_{A,k}}{(Z_{A,1})^k} \approx \sum _{\pi \in NC(k)} e^{d(\textrm{id},\pi )\frac{\left|\gamma _{A,1}\right|}{4G_N} + d(\pi ,\tau )\frac{\left|\gamma _{A,2}\right|}{4G_N}}. \end{aligned}$$

(A.4)

This computation is in one-to-one correspondence with the computation of the moments for a subsystem of a single random tensor, as observed in [60]. It also corresponds more generally to a random tensor network computation with two minimal cuts and maximally entangled link states, as is clear from the computations in Sects. 2 and 3. This is in agreement with the claim that random tensor network states are a model for fixed-area states. One can also add bulk matter in this path integral computation, which will again be in correspondence to a similar computation in a random tensor network [6] with a background state, as in Sect. 2.3. From the moment computation in Eq. (A.4) and applying the results for the entanglement of a single random tensor, we observe that for two fixed surfaces of exactly equal size, the (appropriately scaled) entanglement spectrum is a Marchenko–Pastur distribution, giving an ${\mathcal {O}}(1)$ correction to the entanglement entropy, agreeing with the gravitational replica trick computation in [54].

1.2.2 General States at the Minimal Surface Phase Transition

We now relax the fixed-area restriction, and study similar calculations performed in [27, 54], and [6]. Denote by $Z_{A,k}(\alpha _1, \alpha _2)$ the path integral where we have fixed the areas of $\gamma _{A,i}$ to be $\alpha _i$. Then, following Section 2.3 in [27], the full path integral is given by

$$\begin{aligned} Z_{A,k} = \int \textrm{d}\alpha _1 \textrm{d}\alpha _2\ Z_k(\alpha _1, \alpha _2). \end{aligned}$$

Again, we consider the semiclassical limit, so we take our saddle point approximation of $Z_k(\alpha _1, \alpha _2)$ in Eq. (A.4), and we also take a saddle point approximation for the integral over $\alpha _1$ and $\alpha _2$. This saddle point will be at the values for $\alpha _i$ where the deficit angles are $\phi _i = \frac{2\pi }{n}$ (since then the saddle point geometry is smooth), which leads to

$$\begin{aligned} \frac{Z_{A,k}}{(Z_{A,1})^k} \approx \sum _{\pi \in NC(k)} e^{(\left|C(\pi )\right| - k)\frac{\left|\gamma ^{(k)}_{A,1}\right|}{4G_N} + (\left|C(\tau ^{-1}\pi )\right| - k)\frac{\left|\gamma ^{(k)}_{A,2}\right|}{4G_N}} \end{aligned}$$

(A.5)

where $\gamma ^{(k)}_{A,i}$ are now minimal surfaces, with a dependence on k. Analytic continuation to $k = 1$ yields the usual surface prescription. In particular, if there are two surfaces that are of almost equal area, the contribution of the larger term is exponentially suppressed for any ${\mathcal {O}}(1)$ or larger difference in areas.

To zoom in on the region where the two surfaces are nearly equal, we write the state as a superposition of fixed-area states, as in Section 3 of [54]. We discretize the area size $\alpha _1$ and $\alpha _2$ over ${{\,\textrm{poly}\,}}(\frac{1}{G_N})$ values and approximate the full holographic pure state with boundary regions A and $B = A^c$ as a (finite) sum

$$\begin{aligned} \vert \psi \rangle _{AB} = \sum _{\alpha _1,\alpha _2} \sqrt{p(\alpha _1,\alpha _2)} \vert \psi _{\alpha _1,\alpha _2}\rangle \end{aligned}$$

where $\vert \psi _{\alpha _1,\alpha _2}\rangle $ is the state where the areas are fixed as $\left|\gamma _{A,i}\right| = \alpha _i$, and p is a probability distribution over the possible areas. Then, a straightforward calculation of the reduced density matrix $\rho _A$ yields a state of the form:

$$\begin{aligned} \rho _A&= \sum _{\alpha _1,\alpha _2}p(\alpha _1,\alpha _2)\rho _{A,\alpha _1,\alpha _2} + \sum _{\alpha _1\ne \alpha _1', \alpha _2\ne \alpha _2'}\sqrt{p(\alpha _1,\alpha _2)p(\alpha _1',\alpha _2')}{{\,\textrm{tr}\,}}_B\\ {}&\quad \left( \vert \psi _{\alpha _1,\alpha _2}\rangle \langle \psi _{\alpha _1',\alpha _2'}\vert \right) ,\\&= \sum _{\alpha _1,\alpha _2}p(\alpha _1,\alpha _2)\rho _{A,\alpha _1,\alpha _2} + OD_A, \end{aligned}$$

where $OD_A$ are the off-diagonal elements of $\rho _A$. One can argue that the states $\rho _{A,\alpha _1,\alpha _2}$ are all mutually orthogonal by entanglement wedge reconstruction—the area operator can be reconstructed on A, and hence, each $\rho _{A,\alpha _1,\alpha _2}$ is perfectly distinguishable from each other by measuring the area operator. Then, the entropy of the diagonal part of the state is easily computed as

$$\begin{aligned} H\left( \sum _{\alpha _1,\alpha _2}p(\alpha _1,\alpha _2)\rho _{A,\alpha _1,\alpha _2}\right)= & {} \sum _{\alpha _1,\alpha _2}p(\alpha _1,\alpha _2)H(\rho _{A,\alpha _1,\alpha _2})\nonumber \\{} & {} - \sum _{\alpha _1,\alpha _2}p(\alpha _1,\alpha _2)\log p(\alpha _1,\alpha _2). \end{aligned}$$

(A.6)

The second term is the so-called entropy of mixing, and it is a standard argument that this term is suppressed relative to the first term [54] as ${\mathcal {O}}(\ln G_N)$ or smaller. The entropies appearing in the first term can be computed using the methods in the previous subsection, for which one finds that $H(\rho _{A,\alpha _1,\alpha _2}) = \frac{\min \{\alpha _1, \alpha _2\}}{4G_N}$.

Returning to the off-diagonal terms $OD_A$, [54] argued that such terms should be subleading in the analytic continuation due to the relevant surfaces breaking replica symmetry. At the same time, [6] argued that such terms should be subleading due to reasons similar to those for the orthogonality of the diagonal elements: Complementary entanglement wedge reconstruction implies one may reconstruct the bulk area operator on B, and hence, such states are perfectly distinguishable on B. Therefore, the partial trace over B vanishes for $\alpha _1\ne \alpha _1'$, $\alpha _2\ne \alpha _2'$.

One reaches the conclusion:

$$\begin{aligned} H(\rho _A) = \sum _{\alpha _1,\alpha _2} p(\alpha _1, \alpha _2)\frac{\min \{\alpha _1, \alpha _2\}}{4G_N} + {\mathcal {O}}(\ln G_N), \end{aligned}$$

(A.7)

In this computation, the ${\mathcal {O}}(1)$ corrections due to the Marchenko–Pastur distribution along each pair of minimal cuts of equal size (or equivalently, the degeneracy in the contributions to the saddle point approximation) is irrelevant, as the entropy of mixing already leads to ${\mathcal {O}}(\ln G_N)$ deviations.

Our results in Sect. 4 can be seen as a rigorous version of the above result for random tensor networks.

1.3 Replica Wormholes and JT Gravity

One of the most basic holographic models of quantum gravity is JT gravity, a $1+1$-dimensional model of gravity; see [67] for a review. JT gravity is also a useful model for the near-horizon dynamics of extremal black holes in any dimension. In this case, the dual theory should be $0+1$-dimensional. In other words, it should be regular quantum mechanics rather than a quantum field theory. Indeed, in [68], it was shown that JT gravity theory is dual to a random matrix model, where the Hamiltonian is a random self-adjoint matrix according to some distribution, providing another strong connection between quantum gravity and random matrix theory. It also appears that such gravitational systems may be dual to an ensemble of boundary theories [16], rather than a single one. Whether this is fundamental, a special feature of 1+1-dimensional models, or due to averaging over microscopic features of the gravity theory, is a line of active research [69].

We now sketch a variation on a calculation in [60], providing a proof of principle that the free probability techniques used in Sect. 3 provide an elegant framework to understand such results. We refer the interested reader to [60] for more in-depth motivation and detailed computations.

We consider JT gravity with an end of the world (EOW) brane containing a large number n of internal states. This model has action

$$\begin{aligned} I = I_{{{\,\textrm{JT}\,}}} + \mu \int _{\text {brane}} \textrm{d}s, \end{aligned}$$

where the action of a manifold M with metric g, induced boundary metric h, (trace of) extrinsic curvature K, and dilaton $\phi $ is given by

$$\begin{aligned} I_{{{\,\textrm{JT}\,}}}[M, g]&= -\frac{S_0}{2\pi }\left[ \frac{1}{2}\int _{M} \sqrt{g} R + \int _{\partial M} \sqrt{h} K \right] \\ {}&\quad - \left[ \frac{1}{2}\int _{M} \sqrt{g}\phi (R + 2) + \int _{\partial M} \sqrt{h}\phi K \right] . \end{aligned}$$

The details of this action are not very important for us; we just note that we will take the $S_0$ parameter to be large, and that this suppresses contributions where the manifold M has genus $\gamma > 0$ in the Euclidean path integral.

Such systems are of interest when studying a simple version of an evaporating black hole. Let

$$\begin{aligned} \vert \psi \rangle = \frac{1}{\sqrt{n}} \sum _{i = 1}^n \vert \psi _{B,i}\rangle \vert i_{{{\,\textrm{Rad}\,}}}\rangle \end{aligned}$$

where $\vert \psi _{B,i}\rangle $ is the state of the black hole with the EOW brane in state i, and $\vert i_{{{\,\textrm{Rad}\,}}}\rangle $ is a reference state, which can be thought of as the radiation system. Notice that the entanglement spectrum of this state is flat. We generalize this to

$$\begin{aligned} \vert \psi \rangle = \sum _{i = 1}^n \sqrt{p_i} \vert \psi _{B,i}\rangle \vert i_{{{\,\textrm{Rad}\,}}}\rangle , \end{aligned}$$

where the entanglement between the black hole and the radiation has some non-trivial spectrum, which we will assume to be close to uniform, so that $\tfrac{p_i}{n}$ is bounded by a constant for all i. In other words, we assume this distribution satisfies the bounded spectral variation assumption from Sect. 3. We let

$$\begin{aligned} m_k = \sum _{i=1}^n n^{k-1} p_i^k \end{aligned}$$

be the (appropriately scaled) moments of the entanglement spectrum of the EOW brane. Moreover, we write the path integral on a disk geometry with k boundary components and k EOW branes as $e^{-S_0}Z_k$. Then following the arguments of [60], one can compute the kth moment of the radiation system for large n and large $e^{S_0}$ (large n enforces a planar limit with only non-crossing partitions, while large $e^{S_0}$ ensures that only genus $\gamma =0$ geometries contribute), as illustrated in Fig. 7b. The contributions of path integral configurations connecting different replicas are called replica wormholes. This diagrammatic computation shows that

$$\begin{aligned} {{\,\textrm{tr}\,}}[\psi _{{{\,\textrm{Rad}\,}}}^k] = \sum _{\pi \in NC(k)} m_{\pi } \frac{Z_{\pi ^{-1}\tau }}{Z_1^k} n^{-d(\pi ,\textrm{id})}e^{-S_0 d(\pi ,\tau )}. \end{aligned}$$

(A.8)

In this expression, recall that $Z_{\sigma } = \prod _{l \in C(\sigma )} Z_{l}$, where $C(\sigma )$ is the cycle type of $\sigma $, and $l \in C(\sigma )$ are the lengths of the cycles of $\sigma $. This expression implies that if $n \gg e^{S_0}$, the dominant contribution in Eq. (A.8) is given by $\pi = \tau $. On the other hand, if $n \ll e^{S_0}$, the dominant contribution in Eq. (A.8) is given by $\pi = \textrm{id}$. This corresponds to the situation where there is a unique minimal surface (more precisely, a unique quantum extremal surface). We are interested in the regime at the phase transition, which is analogous to the Page time of an evaporating black hole, so we assume $ne^{-S_0} \rightarrow 1$. The coefficients $m_{\sigma }$ correspond to the weight of the $\sigma $ configuration, as determined by the number and length of the cycles in $\sigma $, and the probability distribution of eigenstates $p_i$. In the case of the flat entanglement spectrum, this number equals the number of closed loops between the connected components. This will also be the case for the non-trivial entanglement spectrum, but each loop will have a different weight that depend on the $p_i$’s.

The $m_k$ are the (scaled) moments of a probability distribution. While the explicit expression itself is not important for our purposes, the $Z_l$ can be written as the lth moments of a probability distribution [60]. Hence, Eq. (A.8) is a product of moments, summed over all non-crossing partitions of length k. As a result, ${{\,\textrm{tr}\,}}[\psi _{{{\,\textrm{Rad}\,}}}^k]$ can be calculated in the planar limit very simply by way of free probability theory.

More precisely, we may define moment-generating functions for ${{\,\textrm{EOW}\,}}$, ${{\,\textrm{JT}\,}}$, and ${{\,\textrm{Rad}\,}}$:

$$\begin{aligned} M_{{{\,\textrm{JT}\,}}}(z)&= \sum _{k=1}^\infty \frac{Z_{k}}{Z_1^k} z^k, \\ M_{{{\,\textrm{EOW}\,}}}(z)&= \sum _{k = 1}^{\infty } m_k z^k,\\ M_{{{\,\textrm{Rad}\,}}}&= \sum _{k=1}^\infty e^{S_0(k-1)} {{\,\textrm{tr}\,}}[\psi _{{{\,\textrm{Rad}\,}}}^k] z^k. \end{aligned}$$

Given a moment-generating function M(z), which is a formal power series, recall that its S-transform is given by

$$\begin{aligned} S(z) = \frac{1+z}{z}M^{-1}(z). \end{aligned}$$

We use this to define the S-transforms $S_{{{\,\textrm{JT}\,}}}$, $S_{{{\,\textrm{EOW}\,}}}$ and $S_{{{\,\textrm{Rad}\,}}}$. From Theorem 3.2, we see that the relation between the moments in Eq. (A.8) implies that these are related as

$$\begin{aligned} S_{{{\,\textrm{Rad}\,}}}(z) = \frac{1}{1 + z}S_{{{\,\textrm{JT}\,}}}(z) S_{{{\,\textrm{EOW}\,}}}(z). \end{aligned}$$

(A.9)

This means that, at the phase transition where $n \approx e^{S_0}$, the spectrum of $\psi _{{{\,\textrm{Rad}\,}}}$ can be described as a free product of the spectra of the end of the world brane, the JT gravity spectrum and a Marchenko–Pastur distribution.

1.3.1 A Recursion Relation for the Resolvent

Given a moment-generating function M(z), we may also define the resolvent function R(z) by

$$\begin{aligned} R(z) = \frac{1}{z}\left( 1 + M\left( \frac{1}{z}\right) \right) . \end{aligned}$$

To relate to previous results, we consider the case where the entanglement with the radiation is maximally entangled. In this case, $S_{{{\,\textrm{EOW}\,}}}(z) = 1$ and $S_{{{\,\textrm{Rad}\,}}}(z) = \frac{1}{1 + z}S_{{{\,\textrm{JT}\,}}}(z)$. By definition of the S-transform and setting $z \rightarrow M_{{{\,\textrm{Rad}\,}}}(z)$, this implies

$$\begin{aligned} \frac{1}{1 + M_{{{\,\textrm{Rad}\,}}}(z)}S_{{{\,\textrm{JT}\,}}}(M_{{{\,\textrm{Rad}\,}}}(z)) = \frac{1 + M_{{{\,\textrm{Rad}\,}}}(z)}{M_{{{\,\textrm{Rad}\,}}}(z)}z, \end{aligned}$$

which we may rewrite as (again using the definition of the S-transform):

$$\begin{aligned} M_{{{\,\textrm{Rad}\,}}}(z) = M_{{{\,\textrm{JT}\,}}}[z(1 + M_{{{\,\textrm{Rad}\,}}}(z))]. \end{aligned}$$

In terms of the resolvent, this becomes

$$\begin{aligned} R(z)&= \frac{1}{z} + \frac{1}{z}M_{R}\left( \frac{1}{z}\right) \\&= \frac{1}{z} + \frac{1}{z}M_{{{\,\textrm{JT}\,}}}(R(z))\\&= \frac{1}{z} + \sum _{k=1}^\infty \frac{Z_{k}}{Z_1^k}\frac{R(z)^k}{z}, \end{aligned}$$

which is a recursion relation previously derived in [60] by a diagrammatic argument. More generally, we can interpret Eq. (A.9) as a (complicated) recursion relation that directly generalizes the above recursion relation.

Random Tensor Network States and Split Transfer Protocols

Our results involving general background states are closely related to the quantum-information-theoretic task of split transfer introduced in [26], which can be understood as a variant of quantum state merging. The standard setup is as follows: two parties, Alice and Bob, share a state $\phi _{ABC_1\ldots C_m}$, with Alice controlling A, Bob controlling B, and the systems $C_1,\ldots , C_m$ being m “helpers.” Let R be a reference system and $\phi _{ABRC_1\ldots C_m}$ a purification of $\phi $. Initially, the state is shared not only by Alice and Bob, but also with all the helper systems $C_i$. The goal of split transfer is to try to redistribute the state to Alice, Bob, and R, using local quantum operations and classical communication (LOCC) between Alice, Bob and the helper systems, and possibly with the assistance of additional maximally entangled states.

A split transfer protocol consists of

1.
A partitioning of the set of the helper systems: $T_A \sqcup T_B = \{C_1,\ldots ,C_m\}$.
2.
For each $C_i \in T_A$, a number $K_{A,i}$ of shared maximally entangled qubits between Alice and $C_i$, and for each $C_i \in T_B$, a number $K_{B,i}$ of shared maximally entangled qubits between Bob and $C_i$.
3.
An LOCC operation between Alice, Bob and the helper systems, such that after applying the protocol, Alice and Bob share a state $\psi _{ABRC_1\ldots C_m}$, which is close to $\phi _{ABRC_1\ldots C_m}$, and is such that Alice possesses systems A and $T_A$, while Bob controls B and $T_B$. Moreover, after applying the protocol they may be in possession of a number $L_{A,i}$ or $L_{B,i}$ of (approximately) maximally entangled qubits between $C_i$, and, respectively, A or B.

In this case, we say that the split transfer protocol has entanglement costs $K_{A,i} - L_{A,i}$ for all $C_i \in T_A$ and $K_{B,i} - L_{B,i}$ for all $C_i \in T_B$. A precise definition can be found as Definition 14 in [26].

Intuitively, the helper systems need to transfer their correlations with R to Alice and Bob, but without touching R. For instance, a simple protocol would be that the helpers simply teleport their full system to either Alice or Bob, consuming EPR pairs, leading to large entanglement costs. We can construct a potentially much more efficient protocol by way of random measurements, as detailed in Proposition 16 of [26]. Roughly speaking, such a protocol functions because random measurements have the effect of decoupling the helper systems from R. The helpers perform simultaneous random measurements on their systems and send the results of their measurements to Alice and Bob. Then, Alice and Bob can use their share of the global state and their portions of the maximally entangled states to perform a decoding operation conditioned on the results of the random measurements. The state they receive will be a purification of $\psi _{ABC_1\ldots C_m}$, which will then be equivalent to the original $\psi _{ABRC_1\ldots C_m}$ up to local isometries. The way we set up the split transfer protocol above was in a one-shot fashion: We get a single copy of $\phi $ and need to determine the optimal entanglement cost for the protocol.

One can also consider asymptotic variants of split transfer, where one has many copies available and aims to achieve an optimal transfer rate. An example application is the entanglement of assistance. Suppose that Alice, Bob, and the helper systems $C_i$ get many copies of a pure state $\phi $. At what rate can they distill maximally entangled pairs between Alice and Bob, if Alice and Bob are allowed to perform LOCC operations with all the helper systems? In this case, the answer is that the rate is given by

$$\begin{aligned} \min _{T_A} S(AT_A)_{\phi }, \end{aligned}$$

that is, by minimizing the entanglement entropy over all bipartitions. This rate is reminiscent of the importance of minimal cuts in a random tensor network, and the connection was explored in [40].

To see how the task of split transfer relates to random tensor networks, we consider three boundary regions, A and B, under the control of Alice and Bob, and the purifying system R. Each of the bulk vertices on the network correspond to a “helper” party. We would like to know whether there exists a protocol in which the assisting parties are allowed to perform local operations and classical communication (LOCC) such that the state $\phi _{VR}$ is redistributed into a state $\rho _{ABR}$ held by Alice and Bob that can be transformed by local isometries, acting only on A and B, to a state close to $\phi _{VR}$. The protocol given in [26] consists of simultaneous random measurements by each of the helpers. This precisely corresponds to the random projections performed in constructing the random tensor network state with this background state!

In this light, we can interpret Theorem 4.4 as a result on split transfer. Let us assume that $\phi \in {\mathcal {P}}_{\scriptscriptstyle {=}}(A B R C_1 \ldots C_n)$, denote the associated random tensor network state by $\rho _{ABR}$, and choose a partitioning $T_A \sqcup T_B$ of the assisting (bulk) parties. Since $H_2(A \vert B)_{\phi \vert \phi } \ge H_{\min }(A \vert B)_{\phi }$, Theorem 4.4 directly yields

Theorem

Suppose that

$$\begin{aligned} H_{\min }(S_A \vert B R T_B)_{\phi } \ge K_1 \end{aligned}$$

(B.1)

for all non-empty subsets $S_A \subseteq T_A$ and

$$\begin{aligned} H_{\min }(S_B \vert A R T_A)_{\phi } \ge K_2 \end{aligned}$$

(B.2)

for all non-empty subsets $S_B \subseteq T_B$. Then,

$$\begin{aligned}&\mathbbm {E}\min _{V_A, V_{B}} \Vert (V_A \otimes V_{B} \otimes I_R)\rho (V_A^\dagger \otimes V_{B}^\dagger \otimes I_R) - \phi _{ABR C_1 \ldots C_n}\Vert _1 \nonumber \\ {}&\quad = {\mathcal {O}}((2^{-\frac{1}{4}K_1} + 2^{-\frac{1}{4}K_2}). \end{aligned}$$

(B.3)

where the minimum is over isometries $V_{A}: {\mathcal {H}}_A \rightarrow {\mathcal {H}}_{A T_A}$ and $V_{B}: {\mathcal {H}}_{B} \rightarrow {\mathcal {H}}_{B T_B}$.

This result shows that if $K_1$ and $K_2$ are sufficiently large, then after measurement in a random basis, the state possessed by Alice and Bob can, with high probability, be used to approximately reconstruct $\phi $ by acting with local isometries on the systems of Alice and Bob.

Suppose we fix values for $K_1$ and $K_2$. If the conditions in Eqs. (B.1) and (B.1) are not satisfied for the initial state $\phi $, we can use another state where we have added an appropriate number of maximally entangled Bell pairs between the assisting parties, increasing the entanglement cost of the protocol.^{Footnote 5} An interesting open question in this context is whether one can generalize this result using smooth entropies in Eqs. (B.1) and (B.1). As alluded to in Sect. 4, the problem is that for a general state $\phi $, one would need to perform simultaneous smoothing for all the relevant subsystems, which remains an open problem.

There is an alternative approach, in which it is straightforwardly possible to use smooth entropies [26]. In this approach, one merges each party in $T_A$ one by one, and similarly for $T_B$. That is, we choose some ordering $T_A = \{1, \ldots , m\} = [m]$ and we apply a sequence of state merging protocols where we merge the state in m steps, where a single step merges $A \cup T_A {\setminus } [i-1]$ into $A \cup T_A {\setminus } [i]$. In this case, it is not hard to see that, if we allow some error the entanglement cost is determined by the smooth conditional entropies $H_{\min }^\varepsilon (\{i+1\} \vert B R [i] T_B)_{\phi }$. We perform a similar protocol for B and the assisting $T_B$ systems.

1.1 Split Transfer and Recovery in Holography

Split transfer is closely related to subregion-subregion duality, or entanglement wedge reconstruction, in holography. Consider an asymptotically AdS, stationary semiclassical geometry which is dual to a boundary CFT state $\rho $. We fix a time-reversal invariant spatial slice and partition the boundary into A and ${\bar{A}}$. Let $\gamma _A$ be the minimal surface for A, and recall that $\partial \gamma _A = \partial A$. The region enclosed by A and $\gamma _A$ is the entanglement wedge of A, which we denote by $\Gamma _A$. The claim of entanglement wedge reconstruction is that if we act with a low-energy local bulk operator in the entanglement wedge of A, we can reconstruct the action of this operator as a corresponding operator acting on the boundary system A.

One way to make this more precise is by considering a code subspace of bulk states S, which can be thought of as a set of states obtained by acting with low-energy operators on a fixed semiclassical spacetime. The action of the operators is small, in the sense that they do not create a significant backreaction that changes the geometry. We then introduce a reference system R of the same dimension as S. In this framework, the AdS/CFT correspondence describes an encoding of the bulk into the boundary, taking a bulk state $\phi _{\Gamma _A\Gamma _A^cR}$ to a boundary state $\rho _{A {{\bar{A}}} R}$. In this setup, the claim of entanglement wedge reconstruction is that we can act with an isometry on A to recover $\phi _{\Gamma _A R}$. (And $\Gamma _A$ is actually the maximal such region.) This situation corresponds to quantum state merging—there is a single decoder A. However, in AdS/CFT, one usually requires the stronger condition of complementary entanglement wedge reconstruction, in which the entanglement wedge for ${\bar{A}}$ is also the complement of $\Gamma _A$, so that $\phi _{\Gamma _A^c R}$ is recoverable from ${\bar{A}}$. This stronger requirement with two decoders is closely related to split transfer. These ideas and the precise relation to quantum information theory remain an active area of research, e.g., [6, 7, 36, 40].

In particular, [6, 7, 14] have argued that one-shot quantum information is the correct framework to understand entanglement wedge reconstruction, as holography is fundamentally a one-shot setting—we are provided with a single copy of a gravitational or CFT state, rather than asymptotically many copies.

The limit of large effective central charge, and hence small $G_N$, does reproduce certain aspects of the many-copy limit, as we reviewed in “Appendix A.1.” However, the distinction between the one-shot and asymptotic regimes becomes apparent in the presence of large bulk entropy, where the bulk entropy, and hence R, is large. Such situations arise when studying, for example, the black hole information paradox. In this case, where we assume we have a bulk state $\phi $, the minimal surface prescription in Eq. (A.3) is replaced by quantum extremal surface prescription:

$$\begin{aligned} H(\rho _A) = \min \text {ext}_{\gamma _A} \left\{ \frac{\left|\gamma _A\right|}{4 G_N} + H(\Gamma _A)_{\phi } \right\} , \end{aligned}$$

where we minimize over extremal surfaces $\gamma _A$, and we minimize the joint contribution of the area of $\gamma _A$ and the bulk entropy contained in the associated entanglement wedge $\Gamma _A$. This formula has a natural tensor network interpretation: Consider a random tensor network state with background state $\phi = \phi _{V^{(b)}R} \otimes \phi _{V^{(l)}}$, where $\phi _{V^{(b)}R}$ is a general background state (accounting for bulk entropy) and $\phi _{V^{(l)}}$ is a tensor product of link states on a graph $G = (V,E)$. Let us take maximally entangled link states with dimension D. Then, for some cut $\Gamma _A$ with edge set $\gamma _A$, we have

$$\begin{aligned} H(\phi _{\Gamma _A}) = \log (D)\left|\gamma _A\right| + H(\phi ^{(b)}_{\Gamma _A}), \end{aligned}$$

and we may hope that minimization over this quantity along the cuts gives a good approximation to the entropy. Whether such a prescription is valid depends on the structure of the background state $\phi $. A proposal put forth in [6] is that the surface $\gamma _A$ with entanglement wedge $\Gamma _A$ gives the $\max $-entanglement wedge, if $\Gamma _A$ is the largest region such that, for any other surface $\delta _A$ homologous to A, with $\Delta _A$ the region enclosed by A and $\delta _A$, and where $\Delta _A$ is contained in $\Gamma _A$, it holds that

$$\begin{aligned} H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c R) \gg \frac{\left|\gamma _A\right| - \left|\delta _A\right|}{4G_N}. \end{aligned}$$

In this case, $\Gamma _A$ should be the largest region which can be (approximately) reconstructed from A. Again, one can think of a random tensor network where the background state is a tensor product of a bulk state and a maximally entangled link state of dimension D with $\log (D) = \Theta (G_N^{-1})$ along the discretization of the space. Then, this condition is (apart from the simultaneous smoothing problem) equivalent to Eq. (4.13). Enforcing complementary reconstruction, in which $\Gamma _A$ is the $\max $-entanglement wedge for A, and its complement $\Gamma _A^c$ is the $\max $-entanglement wedge for ${\bar{A}}$, allows us to interpret the holographic encoding of the bulk state into the boundary as a version of one-shot split transfer. See [6] for a detailed discussion of this proposal for holographic systems.

Joint Smoothing of Link States

In this section, we provide a proof of Lemma 4.5. In our application of one-shot entropy estimates in Sects. 4.3 and 4.4, we needed to jointly smooth over different subsystems for the background state $\phi $. For general background states, joint smoothing is an open problem [25, 31]; however, if the background state is a product of link states as in Eq. (2.2), we can straightforwardly do so. In order to prove Lemma 4.5, we first need some intermediate lemmas. These lemmas involve quantum states which can be expanded in some preferential basis with positive coefficients. Therefore, for Lemmas C.1 to C.3 and C.5, we assume that each quantum system has a preferential basis, which we assume without loss of generality to be the standard basis.

Lemma C.1

Suppose that $\phi \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(X S T)$ can be written as

$$\begin{aligned} \phi _{XST} = \sum _{i} \mathinner {|i\rangle \langle i|} \otimes \phi _{ST,i} \end{aligned}$$

where each $\phi _{ST,i}$ is a pure state such that $\vert \phi _{ST,i}\rangle $ can be written with positive coefficients in the standard basis. If

$$\begin{aligned} {{\tilde{\phi }}}_{XST} = \sum _{i} \mathinner {|i\rangle \langle i|} \otimes {{\tilde{\phi }}}_{ST,i} \end{aligned}$$

is such that each ${{\tilde{\phi }}}_{ST,i}$ is a pure state satisfying $\vert {{\tilde{\phi }}}_{ST,i}\rangle \le \vert \phi _{ST,i}\rangle $ elementwise, then

$$\begin{aligned} H_{\max }(XS \vert T)_{{{\tilde{\phi }}}} \le H_{\max }(XS \vert T)_{\phi }. \end{aligned}$$

Proof

We start by arguing that for the max-entropy

$$\begin{aligned} H_{\max }(XS \vert T)_{{{\tilde{\phi }}}} = \max _{\sigma _T \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(T)} \log F({{\tilde{\phi }}}_{XST}, I_{XS} \otimes \sigma _{T})^2 \end{aligned}$$

(C.1)

we may choose $\sigma _T$ in the optimization problem with nonnegative matrix elements in the standard basis. Indeed, suppose that $\sigma _T$ realizes the maximum in Eq. (C.1). Then, we can write a spectral decomposition $\sigma _T = \sum _{j} p_j \mathinner {|e_j\rangle \langle e_j|}$. Now let $\vert {{\tilde{e}}}_j\rangle = \sum _k \left|\langle e_j | k\rangle \right| \vert k\rangle $. Then, ${{\tilde{\sigma }}}_T = \sum _{j} p_j \mathinner {|{{\tilde{e}}}_j\rangle \langle {{\tilde{e}}}_j|} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(T)$ has nonnegative matrix elements in the standard basis. Moreover,

$$\begin{aligned} F({{\tilde{\phi }}}_{XST}, I_{XS} \otimes \sigma _{T})^{2}&= \sum _i F({{\tilde{\phi }}}_{ST,i}, I_{S} \otimes \sigma _{T})^{2} \\&\le \sum _{i,j} p_j \left|\langle {{\tilde{\phi }}}_{ST,i}\vert I_S \otimes \mathinner {|e_j\rangle \langle e_j|}\right| \vert {{\tilde{\phi }}}_{ST,i}\rangle \\&\le \sum _{i,j} p_j \langle {{\tilde{\phi }}}_{ST,i}\vert I_S \otimes \mathinner {|{{\tilde{e}}}_j\rangle \langle {{\tilde{e}}}_j|} \vert {{\tilde{\phi }}}_{ST,i}\rangle \\&= F({{\tilde{\phi }}}_{XST}, I_{XS} \otimes {{\tilde{\sigma }}}_{T})^{2}. \end{aligned}$$

Let $\sigma _T$ be optimal for Eq. (C.1) with nonnegative matrix elements. It is clear that if $\vert {{\tilde{\phi }}}_{ST,i}\rangle \le \vert \phi _{ST,i}\rangle $, then

$$\begin{aligned} F({{\tilde{\phi }}}_{XST}, I_{XS} \otimes \sigma _{T})^{2}&= \sum _i \langle {{\tilde{\phi }}}_{ST,i}\vert I_{S} \otimes \sigma _{T} \vert {{\tilde{\phi }}}_{ST,i}\rangle \\&\le \sum _i \langle \phi _{ST,i}\vert I_{S} \otimes \sigma _{T} \vert \phi _{ST,i}\rangle \\&= F(\phi _{XST}, I_{XS} \otimes \sigma _{T})^{2} \end{aligned}$$

and hence

$$\begin{aligned} H_{\max }(XS \vert T)_{{{\tilde{\phi }}}}&= \log F({{\tilde{\phi }}}_{XST}, I_{XS} \otimes \sigma _{T})^2 \le \log F(\phi _{XST}, I_{XS} \otimes \sigma _{T})^2 \\ {}&\quad \le H_{\max }(XS \vert T)_{\phi }. \end{aligned}$$

$\square $

Lemma C.2

Suppose that $\phi \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(XYST)$ can be written as

$$\begin{aligned} \phi _{XYST} = \sum _{i} \mathinner {|i\rangle \langle i|} \otimes \mathinner {|j\rangle \langle j|} \otimes \phi _{ST,ij} \end{aligned}$$

where each $\phi _{ST,ij}$ is a pure state such that $\vert \phi _{ST,ij}\rangle $ can be written with positive coefficients in the standard basis. If

$$\begin{aligned} {{\tilde{\phi }}}_{XYST} = \sum _{i} \mathinner {|i\rangle \langle i|} \otimes \mathinner {|j\rangle \langle j|} \otimes {{\tilde{\phi }}}_{ST,ij} \end{aligned}$$

is such that each ${{\tilde{\phi }}}_{ST,ij}$ is a pure state satisfying $\vert {{\tilde{\phi }}}_{ST,ij}\rangle \le \vert \phi _{ST,ij}\rangle $ elementwise, then

$$\begin{aligned} H_{\max }(XS \vert YT)_{{{\tilde{\phi }}}} \le H_{\max }(XS \vert YT)_{\phi }. \end{aligned}$$

Proof

If we have a state $\rho \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(XYST)$ which is of the form

$$\begin{aligned} \sum _{j} \mathinner {|j\rangle \langle j|} \otimes \rho _j \end{aligned}$$

where each $\rho _j \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(XST)$, then (Proposition 4.6 in [73])

$$\begin{aligned} H_{\max }(XS \vert YT) = \log \left(\sum _{j} 2^{H_{\max }(XS \vert T)_{\rho _j}} \right). \end{aligned}$$

Together with Lemma C.1, this implies the result. $\square $

The background states we consider are tensor products of link states, and they can be expressed with positive coefficients in a product basis along the half-edges. The following lemma will help us to show that when we perform smoothing we can retain some of this structure.

Lemma C.3

Suppose that $\phi \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(X S T)$ can be written as

$$\begin{aligned} \phi _{X S T} = \sum _{i} \mathinner {|i\rangle \langle i|} \otimes \phi _{S T, i} \end{aligned}$$

where each $\phi _{S T, i}$ is pure and has a Schmidt decomposition

$$\begin{aligned} \vert \phi _{S T,i}\rangle = \sum _j \sqrt{\lambda _{i,j}} \vert jj\rangle \end{aligned}$$

in the standard basis. Then,

$$\begin{aligned} H_{\min }^{\varepsilon }(X S \vert T)_\phi = H_{\min }(X S \vert T)_{{{\tilde{\phi }}}} \end{aligned}$$

for a state ${{\tilde{\phi }}} \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(XST)$ which is such that $P(\phi , {{\tilde{\phi }}}) \le \varepsilon $ and

$$\begin{aligned} {{\tilde{\phi }}}_{X S T} = \sum _{i} \mathinner {|i\rangle \langle i|} \otimes {{\tilde{\phi }}}_{S T,i} \end{aligned}$$

with

$$\begin{aligned} \vert {{\tilde{\phi }}}_{S T,i}\rangle = \sum _j \sqrt{{{\tilde{\lambda }}}_{i,j}} \vert jj\rangle . \end{aligned}$$

for some $0 \le {{\tilde{\lambda }}}_{i,j} \le 1$.

Proof

Let Y be a copy of X and let R be an additional reference system of sufficiently large dimension. Then, we let

$$\begin{aligned} \vert \phi _{XYSTR}\rangle = \sum _{i,j} \sqrt{\lambda _{i,j}} \vert ii\rangle \otimes \vert jj\rangle \otimes \vert 0\rangle \end{aligned}$$

be a purification of $\phi _{XST}$. Now, by duality of smooth entropies

$$\begin{aligned} H_{\min }^{\varepsilon }(X S \vert T)_\phi = - H_{\max }^{\varepsilon }(X S \vert Y)_\phi \end{aligned}$$

By Lemma 6.13 of [74], we can find a state

$$\begin{aligned} {\bar{\phi }}_{S XY} = \sum _{j} \mathinner {|j\rangle \langle j|} \otimes {\bar{\phi }}_{XY,j} \end{aligned}$$

such that $P(\phi , {{\bar{\phi }}}) \le \varepsilon $ and $H_{\max }^{\varepsilon }(X S \vert Y)_\phi = H_{\max }(X S \vert Y)_{{{\bar{\phi }}}}$. Now, an arbitrary purification of ${\bar{\phi }}_{S XY}$ will be of the form

$$\begin{aligned} \vert {{\bar{\phi }}}_{STXYR}\rangle = \sum _{j,k} \vert jk\rangle \otimes \vert {{\bar{\phi }}}_{XYR,jk}\rangle . \end{aligned}$$

We see that

$$\begin{aligned} \left|\langle \phi | {{\bar{\phi }}}\rangle \right| = \sum _{i,j,k} \delta _{j,k} \sqrt{\lambda _{i,j}}\left|\left( \langle ii\vert \otimes \langle 0\vert \right) \vert \phi _{XYR,jk}\rangle \right| \end{aligned}$$

so, optimizing over the choice of purification $\vert {{\bar{\phi }}}\rangle $, by Uhlmann’s theorem we find a purification of the form

$$\begin{aligned} \vert {{\bar{\phi }}}_{STXYR}\rangle = \sum _{j} \vert jj\rangle \otimes \vert {{\bar{\phi }}}_{XYR,j}\rangle \end{aligned}$$

such that $F(\phi _{XYS}, {{\bar{\phi }}}_{XYS}) = \left|\langle \phi | {{\bar{\phi }}}\rangle \right|$. We define the projector $\Pi _{XY} = \sum _{j} \mathinner {|jj\rangle \langle jj|}$. Then for any $\sigma _Y \in {\mathcal {P}}_{\scriptscriptstyle {\le }}(Y)$ we note that $(I_S \otimes \Pi _{XY}) I_{XS} \otimes \sigma _Y (I_S \otimes \Pi _{XY}) \le I_{XS} \otimes \sigma _{Y}'$ where $\sigma _{Y}' = \sum _j \langle j\vert \sigma _Y \vert j\rangle \mathinner {|j\rangle \langle j|}$ and hence

$$\begin{aligned}&F((I_S \otimes \Pi _{XY} ){{\bar{\phi }}} (I_S \otimes \Pi _{XY} ), I_{XS} \otimes \sigma _Y) \\&\quad = F({{\bar{\phi }}}, (I_S \otimes \Pi _{XY} ) I_{XS} \otimes \sigma _Y (I_S \otimes \Pi _{XY} )) \le F({\bar{\phi }}, I_{XS} \otimes \sigma _{Y}'). \end{aligned}$$

If we let $\vert \phi _{XYSTR}'\rangle = (\Pi _{XY} \otimes I_{STR})\vert {{\bar{\phi }}}_{XYSTR}\rangle $ this implies that

$$\begin{aligned} H_{\max }(XS \vert Y)_{\phi '} \le H_{\max }(XS \vert Y)_{{{\bar{\phi }}}}. \end{aligned}$$

Moreover, it is easy to see that $\left|\langle \phi ' | \phi \rangle \right| = \left|\langle {{\bar{\phi }}} | \phi \rangle \right|$. Again using duality, we find that

$$\begin{aligned} H_{\min }^{\varepsilon }(X S \vert T)_\phi \le H_{\min }(X S \vert T R)_{\phi '}. \end{aligned}$$

Finally, let

$$\begin{aligned} \vert {{\tilde{\phi }}}\rangle = (I_{XYST} \otimes \langle 0\vert ) \vert \phi '\rangle \end{aligned}$$

then by data processing (Theorem 5.7 of [73], note that data processing for the conditional min-entropy is valid for trace non-increasing completely positive maps) it holds that

$$\begin{aligned} H_{\min }(X S \vert T R)_{\phi '} \le H_{\min }(X S \vert T)_{{{\tilde{\phi }}}}. \end{aligned}$$

By construction $\left|\langle {{\tilde{\phi }}} | \phi \rangle \right| = \left|\langle \phi ' | \phi \rangle \right|$ and hence $P({{\tilde{\phi }}},\phi ) \le \varepsilon $. Finally, by construction ${{\tilde{\phi }}}$ is of the desired form

$$\begin{aligned} \vert {{\tilde{\phi }}}_{STXY}\rangle = \sum _{i,j} \sqrt{{{\tilde{\lambda }}}_{i,j}}\vert ii\rangle \otimes \vert jj\rangle . \end{aligned}$$

$\square $

The following lemmas will be used in our joint smoothing construction to bound the purified distance of the smoothed state.

Lemma C.4

Suppose $\phi , \psi \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ are pure states, and suppose

$$\begin{aligned} \vert \phi \rangle = \sum _i \sqrt{\lambda _i} \vert i\rangle \qquad \vert \psi \rangle = \sum _i \sqrt{\mu _i} \vert i\rangle \end{aligned}$$

for some $\lambda _i, \mu _{i} \ge 0$. Then,

$$\begin{aligned} T(\phi , \psi )&\le P(\phi ,\psi ) \le \sqrt{2\sum _i \left|\lambda _i - \mu _i\right|}. \end{aligned}$$

Proof

Let

$$\begin{aligned} \rho = \Delta (\phi ), \qquad \sigma = \Delta (\psi ) \end{aligned}$$

where $\Delta $ is the completely dephasing channel, so

$$\begin{aligned} \rho = \sum _i \lambda _i \mathinner {|i\rangle \langle i|} \qquad \sigma = \sum _i \mu _i \mathinner {|i\rangle \langle i|}. \end{aligned}$$

Then, since $\rho $ and $\sigma $ are diagonal in the same basis

$$\begin{aligned} P(\phi , \psi ) = P(\rho , \sigma ), \end{aligned}$$

we then estimate the trace distance by

$$\begin{aligned} T(\rho , \sigma ) \le \Vert \rho - \sigma \Vert _1 = \sum _i \left|\lambda _i - \mu _i\right| \end{aligned}$$

and we apply the Fuchs–van de Graaff inequalities, Eq. (4.1), to estimate

$$\begin{aligned} T(\phi , \psi ) \le P(\phi ,\psi ) = P(\rho , \sigma ) \le \sqrt{2T(\rho , \sigma )} \le \sqrt{2\sum _i \left|\lambda _i - \mu _i\right|}. \end{aligned}$$

$\square $

Lemma C.5

Suppose $\phi , \phi _j \in {\mathcal {P}}_{\scriptscriptstyle {\le }}({\mathcal {H}})$ are pure states for $j \in [n]$, and suppose

$$\begin{aligned} \vert \phi \rangle = \sum _i \sqrt{\lambda _i} \vert i\rangle \qquad \vert \phi _j\rangle = \sum _i \sqrt{\lambda _{j,i}} \vert i\rangle \end{aligned}$$

for some $\lambda , \lambda _{i,j} \ge 0$ and suppose $P(\phi , \phi _j) \le \varepsilon $ for all j. Then, if we let

$$\begin{aligned} \vert {{\tilde{\phi }}}\rangle = \sum _i \min _j \sqrt{\lambda _{j,i}} \vert i\rangle \end{aligned}$$

it holds that $P(\phi , {{\tilde{\phi }}}) \le 2\sqrt{n\varepsilon }$.

Proof

Let

$$\begin{aligned} \rho = \sum _i \lambda _i \mathinner {|i\rangle \langle i|} \qquad \rho _j = \sum _i \lambda _{j,i} \mathinner {|i\rangle \langle i|}. \end{aligned}$$

Then, $P(\rho ,\rho _j) = P(\phi , \phi _j)$ and using the Fuchs–van de Graaff inequalities

$$\begin{aligned} \sum _i \left|\lambda _i - \min _j \lambda _{j,i}\right|&\le \sum _j \sum _i \left|\lambda _i - \lambda _{j,i}\right| = \sum _j \Vert \rho - \rho _j\Vert _1 \\&\le 2\sum _{j} T(\rho , \rho _j) \le 2\sum _j P(\phi , \phi _j) \le 2n\varepsilon . \end{aligned}$$

The result now follows from Lemma C.4. $\square $

Finally, we prove our main joint smoothing result. We again consider the setting of a tensor product of link states on a graph $G = (V,E)$, where we consider some boundary subsystem $A \subseteq V_\partial $. Recall that for a cut $\Gamma _A \in C(A)$ we define

Lemma 4.5

$$\begin{aligned} P(\phi ,\phi ^\varepsilon ) \le 2\left( \sqrt{\left|{\mathcal {C}}_1(\Gamma _A)\right|} + \sqrt{\left|{\mathcal {C}}_2(\Gamma _A)\right|}\right) \sqrt{\varepsilon } \end{aligned}$$

and it holds that for any $\Delta _A \in {\mathcal {C}}_1(\Gamma _A)$

and for any $\Delta _A \in {\mathcal {C}}_2(\Gamma _A)$

Proof

We may assume without loss of generality that $\phi = \bigotimes _{e \in E} \phi _e$ is such that each $\phi _e$ has Schmidt decomposition in the standard basis,

$$\begin{aligned} \vert \phi _e\rangle = \sum _{i = 1}^{D_e} \sqrt{\lambda _{e,i}} \vert i i\rangle \end{aligned}$$

This means we may write

$$\begin{aligned} \vert \phi \rangle = \sum _I \sqrt{\lambda _I} \vert I\rangle \end{aligned}$$

where I runs over all possible basis elements along each edge $I = \{i_e\}_{e \in E}$ and

$$\begin{aligned} \lambda _I = \prod _{e \in E} \lambda _{e,i_e} \qquad \vert I\rangle = \bigotimes _{e \in E} \vert i_e i_e\rangle . \end{aligned}$$

We also let

$$\begin{aligned} E_1&= \{e = (xy) \in E, x,y \in \Gamma _A\}\\ E_2&= \{e = (xy) \in E, x,y \in \Gamma _A^c\}. \end{aligned}$$

so $E = E_1 \sqcup E_2 \sqcup \gamma _A$.

Consider a cut $\Delta _A \in {\mathcal {C}}_1(\Gamma _A)$. Let

$$\begin{aligned} X&= \{(e,x) : e = (xy), x \in \Gamma _A {\setminus } \Delta _A, y \in \Delta _A \}\\ S&= \{(e,x) : e = (xy), x \in \Gamma _A {\setminus } \Delta _A, y \in \Delta _A^c \}\\ T&= \{(e,x) : e = (xy), x \in \Gamma _A^c, y \in \Delta _A^c \}\\ \end{aligned}$$

Then, we see that $H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi } = H_{\min }^{\varepsilon }(XS \vert T)_{\phi }$ and we can write (as $\phi $ is a product state)

$$\begin{aligned} \phi _{XST} = \phi _X \otimes \phi _{ST}. \end{aligned}$$

Here, $\phi _{ST}$ is pure, whereas $\phi _X$ is diagonal in the standard basis along the half-edges in X. From Lemma C.3 (applied with $\phi _{ST,i} = \phi _{ST}$ for each i), we find that we obtain a state $\phi ^{\Delta _A, \varepsilon }$ which is such that $P(\phi ^{\Delta _A, \varepsilon }, \phi ) \le \varepsilon $ and

$$\begin{aligned} H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\Delta _A, \varepsilon }} \ge H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi } \end{aligned}$$

and which can be written as

$$\begin{aligned} \vert \phi ^{\Delta _A, \varepsilon }\rangle = \sum _I \sqrt{\lambda _I^{\Delta _A, \varepsilon }} \vert I\rangle . \end{aligned}$$

for some coefficients $\lambda _I^{\Delta _A, \varepsilon }$. Moreover, we can write

$$\begin{aligned} \vert \phi ^{\Delta _A, \varepsilon }\rangle = \sum _i \vert ii\rangle \otimes \vert \phi _{E_1, i}^{\Delta _A, \varepsilon }\rangle \otimes \vert \phi _{E_2}\rangle \end{aligned}$$

where i runs over all possible basis elements along the cut $\gamma _A$, $i = \{i_e\}_{e \in \gamma _A}$ and $\vert i\rangle = \bigotimes _{e \in \gamma _A} \vert i_e\rangle $, and the $\phi _{E_1, i}^{\Delta _A, \varepsilon }$ are pure states with positive coefficients in the standard basis. We now let

$$\begin{aligned} \vert \phi _1^{\varepsilon }\rangle&= \sum _{I} \sqrt{\lambda _I^\varepsilon } \vert I\rangle \\ \lambda _I^\varepsilon&= \min _{\Delta _A \in {\mathcal {C}}_1(\Gamma _A)} \lambda _I^{\Delta _A, \varepsilon }. \end{aligned}$$

By construction, this state can be written as

$$\begin{aligned} \vert \phi _1^{\varepsilon }\rangle = \sum _i \vert ii\rangle \otimes \vert \phi _{E_1, i}^{\varepsilon }\rangle \otimes \vert \phi _{E_2}\rangle \end{aligned}$$

By Lemma C.5, it holds that

$$\begin{aligned} P(\phi , \phi _1^{\varepsilon }) \le 2\sqrt{\left|{\mathcal {C}}_1(\Gamma _A)\right|\varepsilon }. \end{aligned}$$

An analogous construction can be used to construct $\phi ^{\Delta _A, \varepsilon }$ for $\Delta _A \in {\mathcal {C}}_2(\Gamma _A)$, and taking the minimum over all such cuts, we get a state $\phi _2^{\varepsilon }$ of the form

$$\begin{aligned} \vert \phi _2^{\varepsilon }\rangle = \sum _i \vert ii\rangle \otimes \vert \phi _{E_1}\rangle \otimes \vert \phi _{E_2, i}^{\varepsilon }\rangle \end{aligned}$$

which satisfies

$$\begin{aligned} P(\phi , \phi _2^{\varepsilon }) \le 2\sqrt{\left|{\mathcal {C}}_2(\Gamma _A)\right|\varepsilon }. \end{aligned}$$

We now define

$$\begin{aligned} \vert \phi ^{\varepsilon }\rangle = \sum _i \vert ii\rangle \otimes \vert \phi _{E_1, i}^{\varepsilon }\rangle \otimes \vert \phi _{E_2, i}^{\varepsilon }\rangle . \end{aligned}$$

We will now show that $\phi ^{\varepsilon }$ has the desired properties. First of all, since $\phi $ is normalized, $F_*(\phi , \phi ^\varepsilon ) = F(\phi , \phi ^\varepsilon )$ and $F(\phi , \phi ^\varepsilon ) = F(\phi _1^\varepsilon , \phi _2^\varepsilon )$ so

$$\begin{aligned} P(\phi , \phi ^\varepsilon ) = P(\phi _1^\varepsilon , \phi _2^\varepsilon ) \le P(\phi ,\phi _1^\varepsilon ) + P(\phi , \phi _2^\varepsilon ) \le 2\left( \sqrt{\left|{\mathcal {C}}_1(\Gamma _A)\right|} + \sqrt{\left|{\mathcal {C}}_2(\Gamma _A)\right|}\right) \sqrt{\varepsilon } \end{aligned}$$

using the fact that the purified distance is a metric.

Next, consider $\Delta _A \in C_1(A)$. We note that by duality $H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\varepsilon }} = - H_{\max }(\Gamma _A {\setminus } \Delta _A \vert \Delta _A)_{\phi ^{\varepsilon }}$. We define the following subsystems of half-edges,

$$\begin{aligned} X'&= \{(e,x) : e = (xy), x \in \Gamma _A {\setminus } \Delta _A, y \in \Gamma _A^c\} \\ Y'&= \{(e,x) : e = (xy), x \in \Delta _A, y \in \Gamma _A^c \} \\ S'&= \{(e,x) : e = (xy), x \in \Gamma _A {\setminus } \Delta _A, y \in \Gamma _A \} \\ T'&= \{(e,x) : e = (xy), x \in \Delta _A, y \in \Gamma _A \}. \end{aligned}$$

so $H_{\max }(\Gamma _A {\setminus } \Delta _A \vert \Delta _A)_{\phi ^{\varepsilon }} = H_{\max }(X'S' \vert Y'T')_{\phi ^{\varepsilon }}$. Next, we observe that by construction, $\phi ^{\varepsilon }$ is of the form

$$\begin{aligned} \phi ^{\varepsilon }_{X'Y'S'T'} = \sum _{i,j} \mathinner {|i\rangle \langle i|} \otimes \mathinner {|j\rangle \langle j|} \otimes \phi ^{\varepsilon }_{S'T',ij} \end{aligned}$$

where i runs over all possible basis elements along $\gamma _A {\setminus } \delta _A$, $i = \{i_e\}_{e \in \gamma _A{\setminus } \delta _A}$ and $\vert i\rangle = \bigotimes _{e \in \gamma _A {\setminus } \delta _A} \vert i_e\rangle $, which forms a basis for ${\mathcal {H}}_X$; and j runs over all possible basis elements along $\gamma _A \cap \delta _A$, $j = \{j_e\}_{e \in \gamma _A\cap \delta _A}$ and $\vert j\rangle = \bigotimes _{e \in \gamma _A \cap \delta _A} \vert j_e\rangle $, which forms a basis for ${\mathcal {H}}_Y$. Each $\phi ^{\varepsilon }_{S'T',ij}$ is a pure state. Similarly,

$$\begin{aligned} \phi ^{\Delta _A,\varepsilon }_{X'Y'S'T'} = \sum _{i,j} \mathinner {|i\rangle \langle i|} \otimes \mathinner {|j\rangle \langle j|} \otimes \phi ^{\Delta _A,\varepsilon }_{S'T',ij} \end{aligned}$$

where the $\phi ^{\Delta _A,\varepsilon }_{S'T',ij}$ are pure states. Moreover, by construction $\vert \phi ^{\varepsilon }_{S'T',ij}\rangle \le \vert \phi ^{\Delta _A,\varepsilon }_{S'T',ij}\rangle $ elementwise in the standard basis. Therefore, by Lemma C.2 it holds that

$$\begin{aligned} H_{\max }(X'S' \vert Y'T')_{\phi ^{\varepsilon }} \le H_{\max }(X'S' \vert Y'T')_{\phi ^{\Delta _A,\varepsilon }}. \end{aligned}$$

Finally, by duality $H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\Delta _A,\varepsilon }} = -H_{\max }(X'S' \vert Y'T')_{\phi ^{\Delta _A,\varepsilon }}$ and we conclude

$$\begin{aligned} H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\varepsilon }} \ge H_{\min }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi ^{\Delta _A,\varepsilon }} \ge H_{\min }^{\varepsilon }(\Gamma _A {\setminus } \Delta _A \vert \Gamma _A^c)_{\phi }. \end{aligned}$$

A similar argument is valid for $\Delta _A \in {\mathcal {C}}_2(\Gamma _A)$, showing that

$$\begin{aligned} H_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi ^{\varepsilon }} \ge H_{\min }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi ^{\Delta _A,\varepsilon }} \ge H_{\min }^{\varepsilon }(\Delta _A {\setminus } \Gamma _A \vert \Gamma _A)_{\phi }. \end{aligned}$$

$\square $

In the proof of Theorem 4.12, we need a similar smoothing lemma for a slightly different situation as stated in Lemma 4.11.

Recall the setup: We consider a graph $G = (V,E)$ and we have a set of boundary vertices $V_\partial $, and $A \subseteq V_\partial $. We denote by $\gamma _{A,1}$ the edges incident to A, and $\gamma _{A,2}$ the edges incident to ${{\bar{A}}}$, and we assume these sets do not intersect. We let $E_b = E {\setminus } (\gamma _{A,1} \cup \gamma _{A,2})$. For a cut $\Delta _A \in C(A)$, we let $Y^{\Delta _A}$ be the set of half-edges

$$\begin{aligned} Y^{\Delta _A} = \{(e,x) : e = (xy), x \in \Delta _A^c, y \in A \}. \end{aligned}$$

Lemma 4.11

$$\begin{aligned} \phi _{E_b}&= \bigotimes _{e \in E_b} \phi _e \end{aligned}$$

is a product state and the $\phi _{\gamma _{A,i}}$ have a Schmidt decomposition in the standard basis along the half-edges. Then, there exists a state $\phi ^{\varepsilon }$ which is such that

$$\begin{aligned} P(\phi ,\phi ^{\varepsilon }) \le 2\sqrt{2^{V_b}\varepsilon } \end{aligned}$$

and for all cuts $\Delta _A \in C(A)$ it holds that

$$\begin{aligned} H_{\min }(\Delta _A {\setminus } A \vert A)_{\phi ^\varepsilon } \ge H_{\min }^{\varepsilon }(\Delta _A {\setminus } A \vert A Y^{\Delta _A})_{\phi }. \end{aligned}$$

Proof

The argument is much the same as the proof of Lemma 4.5. We again assume without loss of generality that each $\phi _e$ has Schmidt decomposition in the standard basis, and we may write

$$\begin{aligned} \vert \phi \rangle = \sum _I \sqrt{\lambda _I} \vert I\rangle \end{aligned}$$

as in the proof of Lemma 4.5. Consider a cut $\Delta _A \in C(A)$. Then, we let

$$\begin{aligned} X&= \{(e,x) : e = (xy), x \in \Delta _A, y \in \Delta _A^c \}\\ S&= \{(e,x) : e = (xy), x \in \Delta _A, y \in \Delta _A {\setminus } A\}\\ T&= \{(e,x) : e = (xy), x \in A, y \notin A \} \cup Y^{\Delta _A} \end{aligned}$$

so $H^{\varepsilon }_{\min }(\Delta _A {\setminus } A \vert A Y^{\Delta _A})_{\phi } = H^{\varepsilon }_{\min }(XS \vert T)_{\phi }$. Moreover, by the structure of the state, $\phi _{XST} = \phi _X \otimes \phi _{ST}$, where $\phi _{ST}$ is pure and $\phi _X$ is diagonal in the standard basis. So, from Lemma C.3 we obtain a state

$$\begin{aligned} \vert \phi ^{\Delta _A, \varepsilon }\rangle = \sum _I \sqrt{\lambda _I^{\Delta _A, \varepsilon }} \vert I\rangle \end{aligned}$$

for some real coefficients $\lambda _I^{\Delta _A, \varepsilon } \ge 0$ which is such that $P(\phi , \phi ^{\Delta _A, \varepsilon }) \le \varepsilon $ and $H_{\min }(XS \vert T)_{\phi ^{\varepsilon }} \ge H^{\varepsilon }_{\min }(XS \vert T)_{\phi }$. As before, we define

$$\begin{aligned} \vert \phi ^{\varepsilon }\rangle&= \sum _I \sqrt{\lambda _{I}^\varepsilon } \vert I\rangle \\ \lambda _I^\varepsilon&= \min _{\Delta _A \in C(A)} \lambda _I^{\Delta _A, \varepsilon }. \end{aligned}$$

By Lemma C.5$P(\phi , \phi ^\varepsilon ) \le 2\sqrt{\left|C(A)\right|\varepsilon } = 2\sqrt{2^{\left|V_b\right|}\varepsilon }$. If we define the following subsystems

$$\begin{aligned} X'&= \{(e,x) : e = (xy), x \in \Delta _A, y A \}\\ Y'&= \{(e,x) : e = (xy), x \in \Delta _A^c, y A \}\\ S'&= \{(e,x) : e = (xy), x \in \Delta _A, y \notin A\}\\ T'&= \{(e,x) : e = (xy), x \in \Delta _A^c, y \notin A \} \end{aligned}$$

then $\phi ^{\varepsilon }_{X'S'T'}$ can be written as

$$\begin{aligned} \phi ^{\varepsilon }_{X'Y'S'T'} = \sum _{i, j} \mathinner {|i\rangle \langle i|} \otimes \mathinner {|j\rangle \langle j|} \otimes \phi ^{\varepsilon }_{S'T',ij} \end{aligned}$$

where i runs over all possible basis elements along $\gamma _{A,1} {\setminus } \delta _A$, $i = \{i_e\}_{e \in \gamma _{A,1}{\setminus } \delta _A}$ and $\vert i\rangle = \bigotimes _{e \in \gamma _{A,1} {\setminus } \delta _A} \vert i_e\rangle $, which forms a basis for ${\mathcal {H}}_{X'}$; and j runs over all possible basis elements along $\gamma _{A,1} \cap \delta _A$, $j = \{j_e\}_{e \in \gamma _{A,1}\cap \delta _A}$ and $\vert j\rangle = \bigotimes _{e \in \gamma _{A,1} \cap \delta _A} \vert j_e\rangle $, which forms a basis for ${\mathcal {H}}_{Y'}$. The $\phi ^{\varepsilon }_{S'T',ij}$ are pure states. We can therefore apply Lemma C.1, and

$$\begin{aligned} H_{\max }(X'S' \vert T'Y')_{\phi ^{\varepsilon }} \le H_{\max }(X'S' \vert T'Y')_{\phi ^{\Delta _A,\varepsilon }} \end{aligned}$$

and hence, by applying duality and data processing

$$\begin{aligned} H_{\min }(\Delta _A {\setminus } A \vert A)_{\phi ^\varepsilon }&= - H_{\max }(\Delta _A {\setminus } A \vert T'Y')_{\phi ^{\varepsilon }} = - H_{\max }(X'S' \vert T'Y')_{\phi ^{\varepsilon }}\\&\ge H_{\max }(X'S' \vert T'Y')_{\phi ^{\Delta _A, \varepsilon }} = H_{\min }(\Delta _A {\setminus } A \vert A)_{\phi ^{\Delta _A, \varepsilon }}\\&\ge H_{\min }(\Delta _A {\setminus } A \vert A Y^{\Delta _A})_{\phi ^{\Delta _A, \varepsilon }} \ge H^{\varepsilon }_{\min }(\Delta _A {\setminus } A \vert A Y^{\Delta _A})_{\phi }. \end{aligned}$$

$\square $

Remark C.6

We may make two observations on Lemmas 4.5 and 4.11. In both cases, if we construct the coefficients as $\min \lambda _{I}^{\varepsilon }, \lambda _I$ we find that we may assume the resulting state has coefficients in the basis $\vert I\rangle $ which are upper bounded by $\lambda _I$ (affecting only the constant factor in the upper bound for $P(\phi , \phi ^{\varepsilon })$). Secondly, we note that

$$\begin{aligned} \left|{{\,\textrm{tr}\,}}[\phi ] - {{\,\textrm{tr}\,}}[\phi ^{\varepsilon }]\right| \le \sum _{I}\, \left|\lambda _I - \lambda _I^{\varepsilon }\right| = {\mathcal {O}}(\varepsilon ) \end{aligned}$$

(C.2)

rather than the naive ${\mathcal {O}}(\sqrt{\varepsilon })$ estimate.

A Family of Metrics on the Symmetric Group

In analogy to the Cayley distance on the symmetric group $S_k$, we consider the following function $d_{\rho } :S_k \times S_k \rightarrow \mathbbm {R}_{\ge 0}$ given some arbitrary density matrix $\rho \in {\mathcal {P}}_{\scriptscriptstyle {=}}({\mathcal {H}})$:

$$\begin{aligned} d_{\rho }(\pi _1,\pi _2)&= \sum _{l \in C(\pi _1^{-1} \pi _2)} (l-1) H_l(\rho ) = -\sum _{l \in C(\pi _1^{-1} \pi _2)} \log {{\,\textrm{tr}\,}}[\rho ^l]. \end{aligned}$$

As we saw in Sect. 2.1, this function is closely related to the replica trick for random tensor networks with non-trivial link states. In the case where $\rho $ is a maximally mixed state, this function is precisely proportional to the Cayley distance (2.15). In this appendix, we show that $d_\rho $ is a metric for any non-pure quantum state $\rho $. (If it is pure, then $d_\rho $ vanished identically.)

We say that two permutations $\alpha , \beta \in S_k$ are disjoint if any point not fixed by $\alpha $ is fixed by $\beta $ and vice versa.

Lemma D.1

The function $d_\rho $ defines a metric on $S_k$ for any state $\rho $ which is not pure. Moreover, if $\rho $ does not have flat spectrum the following holds for all $\pi _1,\pi _2,\pi _3\in S_k$:

$$\begin{aligned} d_{\rho }(\pi _1,\pi _2) + d_{\rho }(\pi _2,\pi _3) = d_{\rho }(\pi _1,\pi _3) \end{aligned}$$

if and only if $\pi _1^{-1}\pi _2$ and $\pi _2^{-1}\pi _3$ are disjoint permutations.

Proof

The fact that $d_{\rho }(\pi _1, \pi _2) = 0$ if and only if $\pi _1 = \pi _2$ follows from the assumption that $\rho $ is not pure. The symmetry $d_{\rho }(\pi _1,\pi _2) = d_{\rho }(\pi _2,\pi _1)$ is clear from $C(\pi _1^{-1}\pi _2) = C(\pi _2^{-1}\pi _1)$. Thus, the only non-trivial property we have to show in order for $d_\rho $ to be a metric is the triangle inequality. We let $d = \dim ({\mathcal {H}})$ and we write ${{\,\textrm{spec}\,}}(\rho ) = \{\lambda _i\}_{i=1}^d$. Then, by letting $\alpha = \pi _1^{-1}\pi _2$ and $\beta = \pi _2^{-1}\pi _3$ the triangle inequality

$$\begin{aligned} d_{\rho }(\pi _1,\pi _2) + d_{\rho }(\pi _2,\pi _3) \ge d_{\rho }(\pi _1,\pi _3) \end{aligned}$$

is equivalent to

$$\begin{aligned} \prod _{l \in C(\alpha )}\left( \sum _{i=1}^d \lambda _i^l\right) \prod _{m \in C(\beta )} \left( \sum _{i=1}^d \lambda _i^m\right) \le \prod _{n \in C(\alpha \beta )} \left( \sum _{i=1}^d \lambda _i^n\right) . \end{aligned}$$

(D.1)

Moreover, we need to show that if the spectrum is not flat, we have equality if and only if $\alpha $ and $\beta $ are disjoint permutations. It suffices to show this for the case where $\beta $ is a cycle, as we can write $\beta $ as a product of disjoint cycles in general and iteratively apply the result for the case where $\beta $ is a cycle. We write $\beta = (i_1 \ldots i_m)$ and $\beta _q = (i_1 \ldots i_q)$ for $q \le m$ and we let $\alpha _{q} = \alpha \beta _q$ and $\alpha _0 = \alpha $.

We can then find a unique sequence of numbers $q_0 = 0< q_1< q_2< \ldots < m$ such that, if one compares $\alpha _{q_j}$ and $\alpha _{q_{j+1}}$, then either $q_{j+1} - q_j + 1$ cycles have merged into a single cycle, or a single cycle has split into $q_{j+1} - q_j + 1$ cycles. Moreover, these two operations are alternating in the sense that if $\alpha _{q_j}$ to $\alpha _{q_{j+1}}$ is a merge, then $\alpha _{q_{j+1}}$ to $\alpha _{q_{j+2}}$ is a split and vice versa. Indeed, $\beta _{q+1} = \beta _q (i_{q} \, i_{q+1})$, so $\alpha _{q+1} = \alpha _q (i_{q} \, i_{q+1})$. If $i_q$ and $i_{q+1}$ are in different cycles in $\alpha _q$, then applying $(i_q \, i_{q+1})$ merges these two cycles, and if $i_q$ and $i_{q+1}$ are in the same cycle in $\alpha _q$, then applying $(i_q \, i_{q+1})$ splits this cycle into two cycles. Let

$$\begin{aligned} J = \{j : \alpha _{q_j}\text { to }\alpha _{q_{j+1}}\text { is a merge}\} \end{aligned}$$

and for $j \in J$ let $m_j = q_{j+1} - q_j + 1$, then $\sum _{j \in J} m_j \le m$. Now, it is clear that for any collection $n_1, \ldots , n_r$ of numbers with $\sum _{j=1}^r n_j \le n$ it holds that

$$\begin{aligned} \prod _{j =1}^r\left( \sum _{i=1}^d \lambda _i^{n_j}\right) \ge \sum _{i=1}^d \lambda _i^n \end{aligned}$$

(D.2)

where the inequality is strict unless $r =1$ and $n_1 = n$ (since we assume that $\rho $ is not pure). In particular (recall that we assumed $\beta $ to be a cycle of length m), we may estimate the left-hand side of Eq. (D.1) by

$$\begin{aligned} \prod _{l \in C(\alpha )}\left( \sum _{i=1}^d \lambda _i^l\right) \left( \sum _{i=1}^d \lambda _i^m\right) \le \prod _{l \in C(\alpha )}\left( \sum _{i=1}^d \lambda _i^l\right) \prod _{j \in J} \left( \sum _{i=1}^d \lambda _i^{m_j}\right) . \end{aligned}$$

(D.3)

with equality if and only if $\{m_j\}_{j \in J} = \{m\}$. (So $\beta $ only merges cycles.) We will next argue that for $j \notin J$

$$\begin{aligned} \prod _{l \in C(\alpha _{q_j})}\left( \sum _{i=1}^d \lambda _i^l\right) \le \prod _{l \in C(\alpha _{q_{j+1}})}\left( \sum _{i=1}^d \lambda _i^l\right) . \end{aligned}$$

(D.4)

and for $j \in J$

$$\begin{aligned} \prod _{l \in C(\alpha _{q_j})}\left( \sum _{i=1}^d \lambda _i^l \right) \left( \sum _{i=1}^d \lambda _i^{m_j} \right) \le \prod _{l \in C(\alpha _{q_{j+1}})}\left( \sum _{i=1}^d \lambda _i^l\right) \end{aligned}$$

(D.5)

with equality if and only if the spectrum is flat or all the cycles that are merged are 1-cycles. Then, combining Eqs. (D.3), (D.4) and (D.5) we may conclude that Eq. (D.1) holds, with equality if and only if $\beta $ is a disjoint cycle from $\alpha $.

Equation (D.4) follows immediately from Eq. (D.2), so it remains to show Eq. (D.5). To this end we will apply Jensen’s inequality. In Eq. (D.5) let us assume that s cycles of lengths $l_1,\ldots ,l_s$ are merged into a cycle of length l, so $l = \sum _{p=1}^s l_p$. We let $l_{s+1} = s$. Let $f_p: [d] \rightarrow \mathbbm {R}_{\ge 0}$ be the function defined by $i\mapsto \lambda _{i}^{l_p - 1}$, for $p \in [s + 1]$. Moreover, let

$$\begin{aligned} \eta _p = \frac{l_p-1}{l - 1} \quad \text { for }p \in [s]. \end{aligned}$$

We consider expectation values over the probability measure on [d] where i has probability $\lambda _{i}$. Since $\eta _p \le 1$ for all p, by Jensen’s inequality for $\eta _p \ne 0$

$$\begin{aligned} \left( \mathbbm {E}f_p^{\frac{1}{\eta _p}}\right) ^{\eta _p} \ge \mathbbm {E}f_p \end{aligned}$$

(D.6)

with equality if and only if the spectrum is flat or $\eta _p = 1$. Note that for $\eta _p = 0$ (equivalently $l_p = 1$), $\mathbbm {E}f_p = 1$. Using that

$$\begin{aligned} \sum _{p=1}^{s+1} \eta _p&= \sum _{p=1}^{s} \frac{l_p - 1}{l-1} + \frac{s-1}{l-1} = \frac{l - s + s-1}{l-1} = 1 \\ \end{aligned}$$

and for $\eta _p \ne 0$

$$\begin{aligned} \mathbbm {E}f_p^{\frac{1}{\eta _p}}&= \sum _{i=1}^d \lambda _i^{l} \end{aligned}$$

we find that

$$\begin{aligned} \prod _{p=1}^s \left( \sum _{i=1}^d \lambda _i^{l_p}\right) \sum _{i=1}^d \lambda _i^{s} = \prod _{p=1}^{s+1}\mathbbm {E}f_p \le \prod _{p, \eta _p \ne 0} \left( \mathbbm {E}f_p^{\frac{1}{\eta _p}}\right) ^{\eta _p} = \sum _{i=1}^d \lambda _i^{l}. \end{aligned}$$

Here, we have equality if and only if the spectrum is flat or for all $p \in [s+1]$ it holds that $\eta _p = 1$ or $\eta _p = 0$. Since $\eta _p < 1$ for $p \in [s]$ and $l_{s+1} > 1$, this only happens if $l_p = 1$ for $p \in [s]$ and $s = l_{s+1} = l$ (in other words, if all the merged cycles are 1-cycles). Applying this with $s=m_j$ for each $j\in J$ proves Eq. (D.5). $\square $

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Cheng, N., Lancien, C., Penington, G. et al. Random Tensor Networks with Non-trivial Links. Ann. Henri Poincaré 25, 2107–2212 (2024). https://doi.org/10.1007/s00023-023-01358-2

Download citation

Received: 17 November 2022
Accepted: 14 August 2023
Published: 01 September 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s00023-023-01358-2

Random Tensor Networks with Non-trivial Links

Abstract

Similar content being viewed by others

Reflected entropy in random tensor networks

Entanglement of purification: from spin chains to holography

Negativity spectra in random tensor networks and holography

1 Introduction

Theorem

Theorem

1.1 Notation and Conventions

2 Random Tensor Network States

2.1 The Replica Trick for Random Tensor Networks

2.2 Maximally Entangled Link States and Minimal Cuts

2.3 The Replica Trick for General Background States

2.3.1 Normalization of Random Tensor Network States

Lemma 1.1

Proof

3 Link States with Bounded Spectral Variation

3.1 Random Matrices, Free Probability, and Non-crossing Partitions

3.1.1 Random Matrix Theory and Wishart Matrices

Theorem 1.2

3.1.2 Free Probability

3.1.3 Non-crossing Partitions

Theorem 1.3

Proof

3.2 Entanglement Spectrum of Random Tensor Network States as a Free Product

Proposition 1.4

Proof

Theorem 1.5

Proof

Remark 3.5

3.3 Non-trivial Link States and Entanglement Negativity

4 Link States with Unbounded Spectral Variation

4.1 One-Shot Entropies

4.2 Recovery Isometries

Proposition 1.7

Proof

Lemma 1.8

Proof

Lemma 1.9

Proof

Theorem 1.10

Proof

4.3 One Minimal Cut

Lemma 1.11

Definition 1.12

Theorem 1.13

Proof

4.4 Two Competing Minimal Cuts

Lemma 1.14

Proof

Lemma 1.15

Proof

Definition 1.16

Lemma 1.17

Theorem 1.18

Proof

4.5 Computing Entropies with Two Minimal Cuts

Lemma 1.19

Proof

Corollary 1.20

Proof

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendices

Euclidean Gravity Path Integrals and Entanglement Spectra

1.1 The Replica Trick and Euclidean Path Integrals

1.1.1 Fixed-Area States

1.2 States at a Minimal Surface Phase Transition

1.2.1 Fixed-Area States with Two Minimal Surfaces

1.2.2 General States at the Minimal Surface Phase Transition

1.3 Replica Wormholes and JT Gravity

1.3.1 A Recursion Relation for the Resolvent