# Enhancement of channel capacities with entanglement

==Introduction==

Shared entanglement between a sender and receiver can significantly improve the usefulness of a quantum channel for the communication of either classical or quantum data. Superdense coding and teleportation provide the most well-known examples of this improvement; free entanglement doubles the classical capacity of a noiseless quantum channel and makes it possible for a noiseless classical channel to send quantum data. In fact, the entanglement-assisted classical and quantum capacities of a quantum channel are in many senses simpler and better behaved than their unassisted counterparts~\cite{H98,SW97,D05}. Most importantly, these capacities can be calculated using simple formulas and finite optimization procedures~\cite{BSST99,BSST02}. No such finite procedure is known for either of the unassisted capacities. Moreover, the entanglement-assisted classical and quantum capacities are related by a simple factor of two. The unassisted capacities, in contrast, have completely different formulas. In fact, the simple factor of two generalizes to a statement known as the quantum Reverse Shannon Theorem, which governs the rate at which one quantum channel can simulate another~\cite{QRST}. The answer is given by the ratio of the entanglement-assisted capacities.

Notation: Quantum systems will be denoted by A, B, and so on as well as their variants such as Aʹ and . The choice of letter will generally indicate which party holds a given system, with A reserved for the sender, Alice, and B for the receiver, Bob. Given a quantum system C, C ⊗ n will often be written as Cn. These symbols will be used to denote both the Hilbert space of the quantum system and the set of density operators on that system. Thus, a quantum channel N: Aʹ → B refers to a trace-preserving, completely positive (TPCP) map from the operators on the Hilbert space of Aʹ to those of B. idC refers to the identity channel on C. The map N ⊗ idC will frequently be abbreviated to N in order to simplify long expressions. φ will be abbreviated to φ. πC will refer to the maximally mixed state on C and πd to the maximally mixed state on a specified d-dimensional quantum system.

For a given quantum state φAB on the composite system AB, φA = trBφAB and

H(A)_\varphi = H(\varphi^A) = - {\operatorname{tr}} ( \varphi^A \log_2 \varphi^A )

is the von Neumann entropy of φA, while

H(A|B)_\varphi = -I_c(A\rangle B) = H(AB)_\varphi - H(B)_\varphi

is its conditional entropy and I(A; B)φ = H(A)φ + H(B)φ − H(AB)φ its mutual information.

### Entanglement-assisted classical and quantum capacities

The entanglement-assisted classical capacity of a quantum channel N : Aʹ → B is the optimal rate at which classical information can be communicated through the channel while in addition making use of an unlimited number of maximally entangled states.

The formal definition proceeds as follows. Alice and Bob are assumed to share nS ebits in the form of a maximally entangled state ϕ of Schmidt rank 2nS. Conditioned on her message m ∈ {1, 2, …, 2nR}, Alice will apply an encoding operation Em:  → Aʹn. Bob's decoding is given by a POVM m}m = 12nR on the composite system Bn. The procedure is said to have maximum probability of error ε if

\label{eqn:errorProb}

\max_m\ {\operatorname{tr}}\Big[ \Lambda_m ({\mathcal{N}}^{\otimes n} \circ {\mathcal{E}}_m)(\phi) \Big] \geq 1 - \epsilon.

These elements, illustrated in Figure~\ref{fig:C_E}, consisting of the shared entanglement, as well as the encoding and decoding operations meeting the criterion of Eq.~(\ref{eqn:errorProb}), are called a (2nR, 2nS, n, ε) entanglement-assisted classical code for the channel N. A rate R is said to be achievable if there exists a choice of S ≥ 0 and a sequence of entanglement-assisted classical codes (2nR, 2nS, n, εn) with εn → 0. The entanglement-assisted classical capacity CE(N) of N is defined to be the supremum over all achievable rates.

Theorem [BSST~\cite{BSST99,BSST02}] \label{thm:C_E} The entanglement-assisted classical capacity CE of a quantum channel N: Aʹ → B is given by

\label{eqn:C_E}

C_E({\mathcal{N}}) = \max_\sigma I(A;B)_\sigma,

where the maximization is over states σAB = N(φAAʹ) arising from the channel by acting on the Aʹ half of any pure state φAAʹ. \end{theorem}

The theorem bears a strong formal resemblance to Shannon's noisy coding theorem for the classical capacity of a classical noisy channel. There the capacity formula is also given by an optimization of the mutual information, but over joint distributions between the input and output alphabets arising from the action of the channel. Such a joint distribution cannot exist in general for a quantum channel because the no-cloning theorem excludes the possibility of the input and output existing simultaneously. Eq.~(\ref{eqn:C_E}) instead refers to a natural substitute for the joint input-output distribution: a quantum state arising from the quantum channel acting on half of an entangled pure state.

Another point worth stressing is that, unlike the known formulas for the unassisted classical and quantum capacities of a quantum channel, Eq.~(\ref{eqn:C_E}) refers to only a single use of N instead of the limit of many uses, N ⊗ n. The formula can therefore readily be used to evaluate CE for any channel of interest.

Consider, for example, the d-dimensional depolarizing channel

{\mathcal{D}}_p(\rho) = (1-p)\rho + p \, \pi_d

that with probability p completely randomizes the input but otherwise leaves the input invariant. For such channels, the maximum is achieved by choosing a maximally entangled state for $|\varphi&gt;^{AA'}$, yielding

C_E({\mathcal{D}}_p) = 2 \log_2 d - h_{d^2}\Big( 1 - p \frac{d^2-1}{d^2} \Big),

where for any 0 ≤ q ≤ 1 and integer r ≥ 1,

h_r(q) = -q \log_2 q - (1-q) \log_2 \Big(\frac{1-q}{r-1}\Big)

is the Shannon entropy of the distribution $\big(q,(1-q)/(r-1),\ldots,(1-q)/(r-1)\big)$.

Entanglement-assistance also simplifies the relationship between the classical and quantum capacities of a channel. Proceeding as before to formally define the quantum capacity, Alice and Bob are again assumed to share a maximally entangled state $|\phi&gt;^{\tilde{A}\tilde{B}}$ of Schmidt rank 2nS. Alice's encoding operation will be a TPCP map E:  → Aʹn acting on an input system and her half of the shared entanglement, . Bob's decoding will likewise be a TPCP map D: Bn →  acting on the output of the channel, Bn, and his half of the shared entanglement, . and are assumed to be isomorphic quantum systems of some fixed dimension 2nQ. The procedure is said to have subspace fidelity 1 − ε if

\label{eqn:fidelity}

\,^{\hat{B}}<\varphi | ({\mathcal{D}} \circ {\mathcal{N}}^{\otimes n} \circ {\mathcal{E}})(\phi^{\tilde{A}\tilde{B}} \otimes \varphi^{\hat{A}}) |\varphi\rangle^{\hat{B}} \geq 1 - \epsilon

for all $|\varphi&gt;^{\hat{A}} \in \hat{A}$. These elements, illustrated in Figure~\ref{fig:Q_E}, are together called a (2nQ, 2nS, n, ε) entanglement-assisted quantum code for the channel {\mathcal{N}}. A rate Q is said to be achievable if there exists a choice of S ≥ 0 and a sequence of entanglement-assisted quantum codes (2nR, 2nS, n, εn) with εn → 0. The entanglement-assisted quantum capacity QE(N) of N is defined to be the supremum over all achievable rates.

There is considerable freedom in the definition of the entanglement-assisted quantum capacity. It could, for example, be defined as the largest amount of maximal entanglement that can be generated using the channel, minus the entanglement consumed during the protocol itself. Alternatively, the fidelity criterion Eq.~(\ref{eqn:fidelity}) could be strengthened to require that D ∘ N ⊗ n ∘ E preserve not only pure states on but any entanglement between and a reference system. All of these variants yield the same capacity formula:

Q_E({\mathcal{N}}) = \frac{1}{2} C_E({\mathcal{N}}).

This equivalence is a direct consequence of the existence of the teleportation and superdense coding protocols. When maximal entanglement is available, teleportation converts the ability to send classical data into the ability to send quantum data at half the classical rate. Conversely, by consuming maximal entanglement, superdense coding converts the ability to send quantum data into the ability to send classical data at double the quantum rate.

### Sketch of proof

The proof of a capacity theorem can usually be broken into two parts, achievability and optimality. The achievability part demonstrates the existence of a sequence of codes reaching the prescribed rate while the optimality part shows that it is impossible to do better.

The main idea in the achievability proof can be understood by studying the special case where φAʹ = πAʹ. Let dAʹ = dimAʹ and {Uj}j = 1dAʹ2n be a set of Weyl operators for Aʹn. The relevant property of these operators is that averaging over them implements the constant map: for all density operators ρ,

\frac{1}{d_{A'}^{2n}} \sum_{j=1}^{d_{A'}^{2n}} U_j \, \rho \, U_j^\dagger = \pi^{A'^n}.

Consider the state σj that arises if Alice acts with Uj on the Aʹn half of a rank dAʹn maximally entangled state $|\varphi &gt;^{AA'^n}$ and then sends the Aʹn half of the resulting state through N. (Note that here Aʹn also plays the role of .) The entropy of the resulting state is

H(\sigma_j)= H\big( {\mathcal{N}}( ( U_j \otimes I_{\tilde{B}} ) \varphi ( U_j^\dagger \otimes

I_{\tilde{B}} ) ) \big) \\
= H\big( {\mathcal{N}}( \varphi ) \big)

since Uj does not change the local density operator on Aʹn.

On the other hand, if Alice selects a value of j from the uniform distribution, then the resulting average input state to the channel will be

\pi^{A'^n} \otimes \pi^{A} = \varphi^{A'^n} \otimes \varphi^{A}

and the corresponding average output state will be N(φAʹn) ⊗ φA, which has entropy

H( {\mathcal{N}}(\varphi^{A'^n}) ) + H( \varphi^{A} ).

Therefore, the Holevo quantity of the ensemble of output states, defined as the entropy of the average state minus the average of the entropies of the individual output states, will be equal to

H(\varphi^{A}) + H({\mathcal{N}}(\varphi^{A'^n})) - H\big( {\mathcal{N}}( \varphi^{AA'^n} ) \big).

This is precisely the quantity I(A; B)σ for the state N(φAAʹn) since the channel N transforms the Aʹn system into B. Moreover, if Bob is given the A part of the maximally entangled state, then this is the Holevo quantity of an ensemble of states that can be produced by Alice acting on half of a shared entangled state and then sending her half through the channel. Invoking the HSW theorem for the classical capacity~\cite{H98,SW97} therefore completes the proof; using coding, the Holevo quantity is an achievable communication rate.

The proof that Eq.~(\ref{eqn:C_E}) is optimal involves a series of entropy manipulations similar to the optimality proofs for the unassisted classical and quantum capacities. From the point of view of quantum information, the truly unusual part of the proof is the demonstration that it is unnecessary to consider multiple copies of N~\cite{CA97}. Specifically, let

f({\mathcal{N}}) = \max_\sigma I(A;B)_\sigma,

where the maximization is defined as in Theorem~\ref{thm:C_E}. Techniques analogous to those used for the unassisted capacities yield the upper bound

C_E({\mathcal{N}}) \leq \lim_{n\rightarrow\infty} \frac{1}{n} f({\mathcal{N}}^{\otimes n}).

Unlike the unassisted case, however, a relatively easy argument shows that

\label{eqn:additivity}

f({\mathcal{N}}_1 \otimes {\mathcal{N}}_2) = f({\mathcal{N}}_1) + f({\mathcal{N}}_2).

(The analogous statement is an important conjecture for the classical capacity and is known to be false for the quantum capacity~\cite{DSS98}.) As a result, CE(N) ≤ f(N), which is the optimality part of Theorem~\ref{thm:C_E}.

To see the origin of Eq.~(\ref{eqn:additivity}), it will be helpful to invoke Stinespring's theorem to write Nj = trEjUjBjEj, where Uj: Aʹj → BjEj is an isometry. Fix a state $|\varphi &gt;^{AA_1'A_2'}$ and let σ = (U1 ⊗ U2)(φ). Eq.~(\ref{eqn:additivity}) follows from the fact that

\label{eqn:infoInequality}

I(A;B_1B_2)_\sigma

\leq I(AB_2E_2;B_1)_\sigma + I(AB_1E_1;B_2)_\sigma.

Simply redefining A to be AB2E2 shows that the first term of the righthand side is upper bounded by f(N1). The second term, likewise, is upper bounded by f(N2). Eq.~(\ref{eqn:infoInequality}) is itself equivalent to the inequality \begin{multline} H(B_1 B_2 | E_1 E_2)_\sigma + H(B_1 B_2)_\sigma \\ \;\;\leq H(B_1|E_1)_\sigma + H(B_2|E_2)_\sigma + H(B_1)_\sigma + H(B_2)_\sigma. \end{multline} The inequality H(B1B2)σ ≤ H(B1)σ + H(B2)σ holds by the subadditivity of the von Neumann entropy. Repeated applications of the \emph{strong} subadditivity inequality, moreover, lead to the inequality

H(B_1 B_2 | E_1 E_2)_\sigma \leq H(B_1|E_1)_\sigma + H(B_2|E_2)_\sigma.

Together, they prove Eq.~(\ref{eqn:infoInequality}) and, thence, Eq.~(\ref{eqn:additivity}). The intuitive meaning of this single-letterization'' is unclear, but regardless, it is interesting to note that the proof involved invoking a pair of purifying environment systems, E1 and E2, and studying the entropy relationships between the true outputs of the channel and the environment's share.

### The quantum Reverse Shannon Theorem

A strong argument can be made that the entanglement-assisted capacity of a quantum channel is the most important capacity of that channel and that all the other capacities are, in some sense, of less significance. The fact that it is unnecessary to distinguish between the classical and quantum entanglement-assisted capacities because they are related by a factor of two is a hint in that direction, as is the simple, single-letter formula for CE(N).

A more general argument can be made by considering the problem of having one channel simulate another. Indeed, the quantum capacity of a quantum channel is simply the optimal rate at which that channel can simulate the noiseless channel id2 on a single qubit. Likewise, the classical capacity of a quantum channel is its optimal rate for simulation of a qubit dephasing channel

\rho \mapsto |0\rangle\langle 0| \, \rho \, |0\rangle\langle 0|

   + |1\rangle\langle 1| \, \rho \, |1\rangle\langle 1|.

In this spirit, the fact that CE(N) = 2QE(N) can be re-expressed in the form

Q_E({\mathcal{N}}) = \frac{C_E({\mathcal{N}})}{C_E({\operatorname{id}}_2)}.

Equivalently, when entanglement is free, the optimal rate at which N can simulate a noiseless qubit channel is given by the ratio between the entanglement-assisted classical capacities of N and id2. The quantum Reverse Shannon Theorem generalizes this statement to the simulation of arbitrary channels in the presence of free entanglement.

Suppose that Alice and Bob would like to use N1: Aʹ → B to simulate another channel N2: Aʹ → B. Fix an input state φAʹ and let $|\varphi &gt;^{AA'^n}$ be a purification of (φAʹ) ⊗ n. As always, assume that Alice and Bob share a maximally entangled state $|\phi &gt;^{\tilde{A}\tilde{B}}$ of Schmidt rank 2nS. Alice's encoding operation will be a TPCP map E: Aʹn → Aʹm acting on n copies of the input system Aʹ and her half of the shared entanglement, . Bob's decoding will likewise be a TPCP map D: Bm → Bn acting on m copies of the output of the channel, and his half of the shared entanglement, . This procedure is said to ε-simulate N2 ⊗ n on (φAʹ) ⊗ n if

F\big({\mathcal{N}}_2^{\otimes n}(\varphi^{AA'^n}),({\mathcal{D}}\circ{\mathcal{N}}_1^{\otimes m}\circ {\mathcal{E}})(\phi^{\tilde{A}\tilde{B}} \otimes \varphi^{AA'^n}) \big) \geq 1 - \epsilon,

where F is the mixed state fidelity $F(\rho,\sigma) = \big( {\operatorname{tr}}\sqrt{\rho^{1/2}\sigma\rho^{1/2}}\big)^2$. The entire procedure, illustrated in Figure~\ref{fig:QRST}, is said to be a (2nS, m, n, ε) entanglement-assisted simulation of N2 by N1. A rate R, measured in copies of N2 per copy of N1, is said to be achievable for φAʹ if there exists a choice of S ≥ 0 and a sequence of (2nS, mn, n, εn) entanglement-assisted simulations with n/mn → R while εn → 0.

The quantum Reverse Shannon Theorem states that the entanglement-assisted capacity completely governs the achievable simulation rates. \begin{theorem}[BDHSW~\cite{W02,QRST}] \label{thm:QRST} Given two channels N1: Aʹ → B and N2: Aʹ → B, R is an achievable simulation rate for N2 by N1 and all input states φAʹ if and only if

\label{eqn:simulationRatio}

R \leq \frac{C_E({\mathcal{N}}_1)}{C_E({\mathcal{N}}_2)}.

\end{theorem} Note that the form of Eq.~(\ref{eqn:simulationRatio}) ensures that the simulation is asymptotically reversible: if a channel N1 is used to simulate N2 and the simulation is then used to simulate N1 again, then the overall rate becomes

\frac{C_E({\mathcal{N}}_1)}{C_E({\mathcal{N}}_2)}\frac{C_E({\mathcal{N}}_2)}{C_E({\mathcal{N}}_1)} = 1.

Thus, in the presence of free entanglement and for a known input density operator of the form (φAʹ) ⊗ n, a single parameter, the entanglement-assisted classical capacity, suffices to completely characterize the asymptotic properties of a quantum channel. Moreover, since two channels that are asymptotically equivalent without free entanglement will surely remain equivalent if free entanglement is permitted, Eq.~(\ref{eqn:simulationRatio}) gives essentially the only possible non-trivial, single-parameter asymptotic characterization of quantum channels. This is the sense in which the entanglement-assisted capacity should be regarded as the most important capacity of a quantum channel.

The proof of the quantum Reverse Shannon Theorem is quite involved, but some of its features can be understood without much work. First, note that by the optimality statement of the entanglement-assisted classical capacity, the desired simulation can exist only if Eq.~(\ref{eqn:simulationRatio}) holds. Otherwise, composing the simulation of N2 by N1 with a sequence of codes achieving CE(N2) would result in a sequence of codes beating the capacity formula for N1.

Similarly, note that one method to simulate a channel N1 using N2 is to first use N2 to simulate the noiseless channel and then use the simulated noiseless channel to simulate N1. Since the achievable rates for the first step are characterized by the entanglement-assisted capacity theorem, proving the achievability part of Theorem \ref{thm:QRST} reduces to finding protocols for simulating a general noisy quantum channel N2 by a noiseless one. That perhaps sounds like a strange goal, but it is the difficult part of the quantum Reverse Shannon Theorem nonetheless.

It is likely that the quantum Reverse Shannon Theorem can be extended to cover other types of inputs than the known tensor power states (φAʹ) ⊗ n. The most desirable form of the theorem would be one valid for all possible input density operators on Aʹ ⊗ n, providing a single simulation procedure dependent only on the channels and not the input state. It is known that without modifying the form of the free entanglement, this most ambitious form of the theorem fails, but it is conjectured that the full-strength theorem does hold provided very large amounts of entanglement are supplied in the form of so-called embezzling states~\cite{DH03}.

### Relationships between protocols

There is another sense in which the entanglement-assisted capacity can be viewed as the fundamental capacity of a quantum channel: an efficient protocol for achieving the entanglement-assisted capacity can be converted into protocols achieving the unassisted quantum and classical capacities, or at least very close variants thereof.

An efficient protocol in this case refers to one that doesn't waste entanglement. Suppose that N: Aʹ → B can be written trEUBE for some isometry UBE. Let φAAʹ be a pure state and σABE = UBEφAAʹ the corresponding purified channel output state. Careful analysis of the entanglement-assisted classical communication protocol achieving the rate I(A; B)σ leads to an entanglement-assisted quantum communication protocol consuming entanglement at the rate $\frac{1}{2}I(A;E)_\sigma$ ebits per use of N and yielding communication at the rate of $\frac{1}{2}I(A;B)_\sigma$ qubits per use N. The protocol achieving this goal is known as the father''~\cite{DHW04}.

If the entanglement consumed in the father were actually supplied by quantum communication from Alice to Bob, then the net rate of quantum communication produced by the resulting protocol would be $\frac{1}{2}I(A;B)_\sigma-\frac{1}{2}I(A;E)_\sigma$ qubits from Alice to Bob, that is, the total produced minus the total consumed.

This quantity, how much more information B has about A than E does, can be simplified using an interesting identity. Since $|\sigma &gt;^{ABE}$ is pure,

I(A;E)_\sigma = H(A)_\sigma + H(E)_\sigma - H(AE)_\sigma \\

= H(A)_\sigma + H(AB)_\sigma - H(B)_\sigma.

Expanding I(A; B)σ and cancelling terms then reveals that

\frac{1}{2}I(A;B)-\frac{1}{2}I(A;E)

= -H(A|B)_\sigma
= I_c(A\rangle B)_\sigma,

where the function Ic is known as the coherent information. After optimizing over input states and multiple channel uses, this is precisely the formula for the unassisted quantum capacity of a quantum channel~\cite{D05}. Thus, the net rate of qubit communication for the protocol derived from the father exactly matches the rates necessary to achieve the unassisted quantum capacity. The only caveat is that the protocol derived from the father uses quantum communication catalytically, meaning that some communication needs to be invested in order to get a gain of Ic(AB). For the unassisted quantum capacity, no investment is necessary. Nonetheless, detailed analysis of the situation reveals that the amount of catalytic communication required can be reduced to an amount sublinear in the number of channel uses, meaning the rate of required investment can be made arbitrarily small. In this sense, the father protocol essentially generates the optimal protocols for the unassisted quantum capacity.

Protocols achieving the unassisted classical capacity can be constructed in a similar way. In this case, one starts from an ensemble E = {pj, N(ψjAʹ)} of states generated by the channel. Achievability of the unassisted classical capacity formula follows from achievability of rates of the form

\chi({\mathcal{E}}) = H\big(\sum_j p_j {\mathcal{N}}(\psi_j^{A'})\big)

       - \sum_j p_j H\big( {\mathcal{N}}(\psi_j^{A'})\big)

for arbitrary ensembles of output states. Consider the channel

\tilde}(\rho) = \sum_j \langle j| \rho |j\rangle \cdot {\mathcal{N}}(\psi_j)

and input state $|\varphi \rangle^{AA'} = \sum_j \sqrt{p_j} |j\rangle^{A} |j\rangle^{A'}$. If $\sigma = \tilde{{\mathcal{N}}}(\varphi)$, then I(A; B)σ is equal to χ(E). Thus, there are protocols consuming entanglement that achieve the classical communications rate χ(E) for the modified channel $\tilde{{\mathcal{N}}}$. Because the channel $\tilde{{\mathcal{N}}}$ includes an orthonormal measurement which destroys all entanglement between A and B, however, it can be argued that any entanglement used in such a protocol could be replaced by shared randomness, which could then in turn be eliminated by a standard derandomization argument. The net result is a procedure for choosing rate χ(E) codes for the channel N consisting of states of the form ψj1 ⊗ ⋯ ⊗ ψjn, which is the essence of the achievability proof for the unassisted classical capacity.

This may seem like an unnecessarily cumbersome and even circular approach to the unassisted classical capacity given that the proof sketched above for the entanglement-assisted classical capacity itself invokes the unassisted result in the form of the HSW theorem. The approach becomes more satisfying when one learns that simple and direct proofs of the father protocol exist that completely bypass the HSW theorem~\cite{ADHW05}.

Thus, the entanglement-assisted communication protocols can be easily transformed into their unassisted analogs, confirming the central place of entanglement-assisted communication in quantum information theory.

Category:Quantum Communication