Quantum Information Theory

Scope:

1. Transmission of classical information over quantum channels.

2. The tradeoff between acquisition of quantum state information and disturbance of the state.

3. Quantifying quantum entanglement.

4. Transmission of quantum information over quantum channels.

These goals are mainly accomplished through the interpretation and application of the Von Neumann entropy.


Classical Information Theory

A message is a string of letters chosen from an alphabet of $k$ letters

$$\{a_1, a_2, \ldots, a_k\}.$$

The letters are independent, each occurring with probability $p(a_x)$, with $\sum_{x=1}^{k} p(a_x) = 1$.

A typical message of length $n$ will contain about $np(a_x)$ occurrences of $a_x$ for each $x$, so the number of typical strings is

$$\frac{n!}{\prod_{x=1}^{k}\bigl(np(a_x)\bigr)!}.$$

By the Stirling approximation $\log n! = n\log n - n + O(\log n)$ we have

$$
\begin{aligned}
\log \frac{n!}{\prod_{x=1}^{k}(np(a_x))!}
&= \log n! - \sum_{x=1}^{k} \log\bigl(np(a_x)\bigr)! \\
&= n\log n - n - \sum_{x=1}^{k}\bigl(np(a_x)\log np(a_x) - np(a_x)\bigr) \\
&= n\log n - \sum_{x=1}^{k} np(a_x)\bigl(\log n + \log p(a_x)\bigr) \\
&= -n\sum_{x=1}^{k} p(a_x)\log p(a_x) \\
&= nH(X),
\end{aligned}
$$

where

$$H(X) = -\sum_{x=1}^{k} p(a_x)\log p(a_x)$$

is the Shannon entropy of the ensemble $X = \{a_x, p(a_x)\}$.


So there are approximately $2^{nH(X)}$ typical strings of length $n$ for the letter ensemble $X$. Hence if we treat the typical strings as the only strings that can appear, a string of length $n$ can be compressed to $nH(X)$ bits; that is, only $nH(X)$ bits are needed to store any length-$n$ string.

The noiseless coding theorem states that the optimal code compresses each letter to $H(X)$ bits asymptotically. This is the highest compression rate, given the requirement that messages must be decoded without error as $n \to \infty$.
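
To make the counting concrete, here is a minimal Python sketch (the alphabet and probabilities are illustrative choices, not taken from the text) that compares the exact multinomial count of typical strings with the $2^{nH(X)}$ estimate.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits, H(X) = -sum_x p(x) log2 p(x)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative alphabet: three letters with these probabilities (assumed example).
probs = [0.5, 0.25, 0.25]
n = 400  # message length; n*p(x) must be integers for the exact count

# Exact number of typical strings: n! / prod_x (n p(x))!
counts = [round(n * p) for p in probs]
exact = math.factorial(n)
for c in counts:
    exact //= math.factorial(c)

H = shannon_entropy(probs)
print(f"H(X) = {H:.3f} bits per letter")
print(f"log2(#typical strings) = {math.log2(exact):.1f}")
print(f"n * H(X)               = {n * H:.1f}")
```

The two printed numbers differ only by the $O(\log n)$ correction dropped in the Stirling approximation.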


Another Perspective

For a particular length-$n$ message $x_1 x_2 \ldots x_n$, its prior probability is

$$P(x_1 x_2 \ldots x_n) = p(x_1)p(x_2)\cdots p(x_n)$$

and

$$\log P(x_1 x_2 \ldots x_n) = \sum_{i=1}^{n} \log p(x_i).$$

By the law of large numbers, when $n$ is large enough most messages have a probability $P$ satisfying

$$-\frac{1}{n}\log P(x_1 x_2 \ldots x_n) = -\frac{1}{n}\sum_{i=1}^{n}\log p(x_i) \approx \langle -\log p(x)\rangle \equiv H(X),$$

where the random variable $x$ represents a letter chosen from $X$.


So a typical sequence has probability $P$ satisfying

$$H(X) - \delta < -\frac{1}{n}\log P(x_1 x_2 \ldots x_n) < H(X) + \delta,$$

or

$$2^{-n(H(X)-\delta)} > P(x_1 x_2 \ldots x_n) > 2^{-n(H(X)+\delta)},$$

where $\delta > 0$ is small.
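
A quick numerical check of this concentration, with an assumed three-letter distribution chosen only for illustration: sample random messages and look at the empirical spread of $-\frac{1}{n}\log_2 P$.

```python
import math
import random

probs = {"a": 0.5, "b": 0.25, "c": 0.25}  # assumed example distribution
H = -sum(p * math.log2(p) for p in probs.values())

letters = list(probs)
weights = list(probs.values())
n = 2000       # message length
trials = 1000  # number of sampled messages

rates = []
for _ in range(trials):
    msg = random.choices(letters, weights=weights, k=n)
    log_p = sum(math.log2(probs[x]) for x in msg)  # log2 P(message)
    rates.append(-log_p / n)

print(f"H(X) = {H:.3f} bits")
print(f"-1/n log2 P: min {min(rates):.3f}, "
      f"mean {sum(rates)/len(rates):.3f}, max {max(rates):.3f}")
```

The sampled values cluster tightly around $H(X)$, as the typicality bounds require.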


Interpretation of Shannon Entropy

The Shannon entropy for a specific source X can be seen as the amount of our ignorance about the value of the next letter, or the amount of indeterminacy of the unknown letter. It can also be seen as the amount of information we gain by receiving one letter. In the usual case where the logarithm in H is taken to base 2, the unit of H is bits.


Binary Entropy

Suppose that the alphabet is bits, that is, $X = \{0, 1\}$, with probabilities $p_0 = p$ and $p_1 = 1 - p$.

The entropy in this case is

$$H(X) = H(p) = -p\log p - (1-p)\log(1-p).$$

When $p_0 = \tfrac{1}{2}$, the bit is completely random, and

$$H\!\left(\tfrac{1}{2}\right) = 1$$

is the maximum attainable value of the entropy: we are maximally ignorant about the value of the next letter, or equivalently we gain the most information (one bit) by receiving one letter. When $p_0 = 1$ or $p_1 = 1$, the next bit is completely predictable, and the entropy is

$$H(0) = H(1) = 0,$$

so we are not ignorant about the value of the next bit at all; equivalently, we get no information by receiving one letter. All other cases have entropy between these two extremes.
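
A small sketch of the binary entropy function, checking the two extremes and a few intermediate values (the sampled points are arbitrary):

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:.2f}  H(p) = {binary_entropy(p):.4f} bits")
# Maximum H(1/2) = 1; H(0) = H(1) = 0.
```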


Generalizing the result of the previous section to a general source X, the entropy H is zero whenever one of the letters occurs with certainty. That is,

$$H_{\min}(X) = -\log 1 = 0.$$

The maximum entropy is achieved when all letters occur with equal probability, that is, for a uniform probability distribution. For an X with d letters, the entropy of the uniform distribution is

$$H_{\max}(X) = -\sum_i \frac{1}{d}\log\frac{1}{d} = -\log\frac{1}{d} = \log d.$$

In the general case of a source X with d letters, the information per letter satisfies

$$0 \le H(X) \le \log d.$$


Relative Entropy

If $p(x)$ and $q(x)$ are two probability distributions over the same index set $x$ (or a given set of letters), then the relative entropy of $p(x)$ to $q(x)$ is defined as

$$H(p(x)\|q(x)) \equiv \sum_x p(x)\log\frac{p(x)}{q(x)} = -H(X) - \sum_x p(x)\log q(x).$$

The relative entropy is a measure of the closeness of these two probability distributions. Since $\ln y \le y - 1$, we have

$$
\begin{aligned}
H(p(x)\|q(x)) &= -\sum_x p(x)\log\frac{q(x)}{p(x)} \\
&\ge \frac{1}{\ln 2}\sum_x p(x)\left(1 - \frac{q(x)}{p(x)}\right) \\
&= \frac{1}{\ln 2}\sum_x \bigl(p(x) - q(x)\bigr) \\
&= 0,
\end{aligned}
$$

with equality when $p(x) = q(x)$ for all $x$.
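
A minimal sketch of the relative entropy (base 2), using two assumed example distributions; it illustrates the non-negativity and the vanishing at $p = q$.

```python
import math

def relative_entropy(p, q):
    """H(p||q) = sum_x p(x) log2(p(x)/q(x)); assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.3, 0.2]   # assumed example distributions
q = [0.4, 0.4, 0.2]

print(f"H(p||q) = {relative_entropy(p, q):.4f}  (>= 0)")
print(f"H(q||p) = {relative_entropy(q, p):.4f}  (not symmetric in general)")
print(f"H(p||p) = {relative_entropy(p, p):.4f}  (= 0)")
```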


Mutual Information

Suppose a message composed from X is transmitted through a noisy channel and a message composed from Y is received; that is, the channel distorts a letter $x \in X$ into $y \in Y$ with conditional probability $p(y|x)$. When the message is received, the probability distribution for $x$ can be updated to

$$p(x|y) = \frac{p(y|x)p(x)}{p(y)},$$

where $p(y|x)$ represents the properties of the channel, $p(x)$ the a priori probabilities of the ensemble X, and $p(y) = \sum_x p(y|x)p(x)$. So the message composed from Y contains some information about the original message from X. Using the $p(x|y)$'s we can define the conditional entropy as

$$H(X|Y) = \langle -\log p(x|y)\rangle = -\sum_{x,y} p(x,y)\log p(x|y).$$


Note that

$$
\begin{aligned}
H(X|Y) &= \langle -\log p(x,y) + \log p(y)\rangle \\
&= \langle -\log p(x,y)\rangle - \langle -\log p(y)\rangle \\
&= H(X,Y) - H(Y),
\end{aligned}
$$

where $H(X,Y) \equiv -\sum_{x,y} p(x,y)\log p(x,y)$; similarly,

$$H(Y|X) = H(X,Y) - H(X).$$

We need $H(X)$ bits per letter to specify a message from X; after receiving a message from Y via the noisy channel, we still need $H(X|Y)$ bits per letter to determine the original message. In other words,

$$
\begin{aligned}
I(X;Y) &= H(X) - H(X|Y) \\
&= H(X) + H(Y) - H(X,Y) \\
&= H(Y) - H(Y|X)
\end{aligned}
$$

bits of information per letter are gained by receiving the distorted message. $I(X;Y)$ is the mutual information, which is symmetric in X and Y.
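
A sketch that computes $H(X)$, $H(Y)$, $H(X,Y)$ and the three expressions for $I(X;Y)$ from a joint distribution and checks that they agree (the joint distribution is an assumed example):

```python
import math
from collections import defaultdict

# Assumed example joint distribution p(x, y).
joint = {
    (0, 0): 0.35, (0, 1): 0.15,
    (1, 0): 0.10, (1, 1): 0.40,
}

def H(dist):
    """Shannon entropy in bits of a probability dictionary."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

px, py = defaultdict(float), defaultdict(float)
for (x, y), p in joint.items():
    px[x] += p
    py[y] += p

H_X, H_Y, H_XY = H(px), H(py), H(joint)
I1 = H_X - (H_XY - H_Y)          # H(X) - H(X|Y)
I2 = H_X + H_Y - H_XY            # H(X) + H(Y) - H(X,Y)
I3 = H_Y - (H_XY - H_X)          # H(Y) - H(Y|X)
print(f"H(X)={H_X:.4f}  H(Y)={H_Y:.4f}  H(X,Y)={H_XY:.4f}")
print(f"I(X;Y) = {I1:.4f} = {I2:.4f} = {I3:.4f}")
```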


From the properties of the logarithm we have

$$H(X) \ge H(X|Y) \ge 0, \qquad H(Y) \ge H(Y|X) \ge 0,$$

so

$$I(X;Y) \ge 0, \qquad H(X) + H(Y) \ge H(X,Y).$$

That is, we will not lose any knowledge of a message from X by receiving a message from Y.

Equality occurs when X and Y are independent; then

$$
\begin{aligned}
I(X;Y) &= H(X) - H(X|Y) \\
&= H(X) - \langle -\log p(x|y)\rangle \\
&= H(X) - \left\langle -\log\frac{p(x,y)}{p(y)}\right\rangle \\
&= H(X) - \left\langle -\log\frac{p(x)p(y)}{p(y)}\right\rangle \\
&= H(X) - \langle -\log p(x)\rangle = 0.
\end{aligned}
$$


The Noisy Coding Theorem

With $X = \{x, p(x)\}$ for the input letters, we send a length-$n$ message through a memoryless noisy channel specified by the $p(y|x)$'s. The output ensemble $Y = \{y, p(y)\}$ can be determined from knowledge of X and the channel.

Intuitively, it seems we can reliably send no more than $I(X;Y)$ bits per letter over the noisy channel, a value that depends on the $p(y|x)$'s (the channel) and the $p(x)$'s (the input ensemble). The noisy coding theorem makes this precise: the maximum of $I(X;Y)$ over input distributions is the highest rate at which information can be transmitted with vanishing error.
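
As an illustration (my own example, not from the text), consider a binary symmetric channel that flips each bit with probability $e$. The sketch below computes $I(X;Y)$ as a function of the input distribution and scans for its maximum, which should match the well-known capacity $1 - H(e)$.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_information_bsc(p, e):
    """I(X;Y) for input distribution (1-p, p) over a binary symmetric channel
    with flip probability e: I = H(Y) - H(Y|X) = h2(p(1-e) + (1-p)e) - h2(e)."""
    py1 = p * (1 - e) + (1 - p) * e  # probability that the output bit is 1
    return h2(py1) - h2(e)

e = 0.1  # assumed flip probability
best = max((mutual_information_bsc(p / 1000, e), p / 1000) for p in range(1001))
print(f"max_p I(X;Y) = {best[0]:.4f} at p = {best[1]:.3f}")
print(f"1 - H(e)     = {1 - h2(e):.4f}")
```

The maximum occurs at the uniform input $p = 1/2$, where $H(Y)$ is largest.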


Coding and Transmission of Messages Using Quantum States

The quantum analogue of the previous situation is to replace message letters with quantum states. Suppose for a particular physical system we have the states $|\psi_x\rangle$, each occurring with probability $p(x)$, where $\sum_x p(x) = 1$. Then the density operator describing a single letter is

$$\rho = \sum_x p(x)|\psi_x\rangle\langle\psi_x|.$$

Since the states $|\psi_x\rangle$ may not be mutually orthogonal, different states are not completely distinguishable; they overlap in the state space. Hence the entropy in this case is not

$$H(X) = -\sum_x p(x)\log p(x).$$

Two overlapping letters are not quite two letters; they are effectively fewer than two letters, although always at least one letter.


Von Neumann Entropy

The Von Neumann entropy of the density operator defined above is

$$S(\rho) \equiv -\mathrm{tr}\,(\rho\log\rho).$$

The logarithm of a matrix is defined as the inverse of the matrix exponential: for matrices A and B, if

$$e^A = \sum_{n=0}^{\infty}\frac{A^n}{n!} = B,$$

then

$$\log B = A.$$

The logarithm of a matrix is normally very hard to calculate, but for a diagonal matrix A with $A_{ij} = \delta_{ij} a_i$, the exponential is

$$(e^A)_{ij} = \delta_{ij} e^{a_i} = B_{ij},$$

so B is diagonal, and with $B_{ij} = \delta_{ij} b_i$ we have $(\log B)_{ij} = \delta_{ij}\log b_i$.
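
For a Hermitian, positive-definite matrix the logarithm can therefore be computed by diagonalizing, taking the logarithm of the eigenvalues, and transforming back. A minimal numpy sketch (the test matrix is an arbitrary example):

```python
import numpy as np

def func_hermitian(rho, f):
    """Apply a scalar function f to a Hermitian matrix via eigendecomposition:
    rho = U diag(l) U^dagger  ->  f(rho) = U diag(f(l)) U^dagger."""
    evals, evecs = np.linalg.eigh(rho)
    return evecs @ np.diag(f(evals)) @ evecs.conj().T

# Arbitrary example density matrix (Hermitian, unit trace, positive eigenvalues).
rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])
log_rho = func_hermitian(rho, np.log)
print(np.allclose(func_hermitian(log_rho, np.exp), rho))  # exp undoes log -> True
S = -np.trace(rho @ log_rho) / np.log(2)  # S(rho) = -tr(rho log2 rho), in bits
print(f"S(rho) = {S:.4f} bits")
```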


Since any density operator can be diagonalized, suppose the eigenvalues of ρ are $\lambda_i$, that is,

$$\rho = \sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i|,$$

where the $|\varphi_i\rangle$'s are mutually orthonormal; then

$$S(\rho) = -\sum_i \lambda_i\log\lambda_i.$$

This is the same as the entropy of an ensemble of letters occurring with probabilities $\lambda_i$, since every density operator has unit trace. The equality is not surprising, since orthogonal states are completely distinguishable and can therefore be treated as classical letters.
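
The following sketch (with an assumed two-state ensemble of non-orthogonal pure states) computes $S(\rho)$ from the eigenvalues of $\rho$ and compares it with the Shannon entropy $H(X)$ of the preparation probabilities, illustrating that overlapping letters carry less than $H(X)$:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -sum_i lambda_i log2 lambda_i over the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

# Assumed ensemble: |0> and (|0>+|1>)/sqrt(2), each with probability 1/2.
p = [0.5, 0.5]
psi = [np.array([1.0, 0.0]),
       np.array([1.0, 1.0]) / np.sqrt(2)]

rho = sum(px * np.outer(v, v.conj()) for px, v in zip(p, psi))
H_X = -sum(px * np.log2(px) for px in p)

print(f"H(X)   = {H_X:.4f} bits")                        # 1.0000
print(f"S(rho) = {von_neumann_entropy(rho):.4f} bits")   # about 0.60 < H(X)
```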

Density operators can themselves be treated as letters: for the ensemble $X = \{\rho_x, p(x)\}$, the density operator describing each letter is

$$\rho = \sum_x p(x)\rho_x.$$

This is the most general case, in which even the individual letters are mixed states, but how


The Von Neumann entropy represents three physical quantities:

1. The quantum information per letter.

2. The classical information per letter.

3. The amount of entanglement.

Yet the theories and methods developed from the Von Neumann entropy may be somewhat limited by their close correspondence with classical information theory. For example, letters are generally represented physically as mixed states rather than pure states, that is, without relative phase information. Could the Von Neumann entropy be a special case of a more general complex entropy?


Quantum (Von Neumann) Relative Entropy

This is the quantum version of relative entropy.

For density matrices ρ1 and ρ2, the relative entropy of ρ1 to ρ2 is defined as

S(ρ1||ρ2) ≡ tr (ρ1 log ρ1) − tr (ρ1 log ρ2) .

The relative entropy is likewise non-negative, and equals zero when ρ1 = ρ2.

Diagonalize both $\rho_1$ and $\rho_2$:

$$\rho_1 = \sum_i p_i |\psi_i\rangle\langle\psi_i|, \qquad \rho_2 = \sum_i q_i |\varphi_i\rangle\langle\varphi_i|;$$

then

$$
\begin{aligned}
S(\rho_1\|\rho_2) &= -S(\rho_1) - \mathrm{tr}\,(\rho_1\log\rho_2) \\
&= \sum_i p_i\log p_i - \sum_i \langle\psi_i|\rho_1\log\rho_2|\psi_i\rangle \\
&= \sum_i p_i\log p_i - \sum_i p_i\langle\psi_i|\log\rho_2|\psi_i\rangle \\
&= \sum_i p_i\log p_i - \sum_i p_i\langle\psi_i|\Bigl(\sum_j(\log q_j)|\varphi_j\rangle\langle\varphi_j|\Bigr)|\psi_i\rangle \\
&= \sum_i p_i\log p_i - \sum_i p_i\sum_j |\langle\varphi_j|\psi_i\rangle|^2\log q_j \\
&= \sum_i p_i\Bigl(\log p_i - \sum_j |\langle\varphi_j|\psi_i\rangle|^2\log q_j\Bigr).
\end{aligned}
$$

Since the logarithm is strictly concave, we have

$$\sum_j |\langle\varphi_j|\psi_i\rangle|^2\log q_j \le \log\Bigl(\sum_j |\langle\varphi_j|\psi_i\rangle|^2 q_j\Bigr),$$

with equality if and only if for every $i$ there is a $j$ with $|\varphi_j\rangle = |\psi_i\rangle$. So

$$S(\rho_1\|\rho_2) = \sum_i p_i\Bigl(\log p_i - \sum_j |\langle\varphi_j|\psi_i\rangle|^2\log q_j\Bigr) \ge \sum_i p_i\bigl(\log p_i - \log r_i\bigr),$$

where $r_i \equiv \sum_j |\langle\varphi_j|\psi_i\rangle|^2 q_j$ satisfies $r_i \ge 0$ and $\sum_i r_i = 1$. The right-hand side is thus a classical relative entropy $H(p_i\|r_i) \ge 0$, so $S(\rho_1\|\rho_2) \ge 0$; in the case of equality $r_i = q_i$ and

$$S(\rho_1\|\rho_2) = \sum_i p_i(\log p_i - \log q_i) \ge 0,$$

since S reduces to a (classical) relative entropy.
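
A numerical sketch of the quantum relative entropy for two assumed example density matrices, checking the non-negativity and that it vanishes when the two states coincide:

```python
import numpy as np

def logm_hermitian(rho):
    """Matrix log (base 2) of a Hermitian positive-definite matrix via eigendecomposition."""
    evals, evecs = np.linalg.eigh(rho)
    return evecs @ np.diag(np.log2(evals)) @ evecs.conj().T

def quantum_relative_entropy(rho1, rho2):
    """S(rho1 || rho2) = tr(rho1 log rho1) - tr(rho1 log rho2), in bits."""
    return float(np.real(np.trace(rho1 @ logm_hermitian(rho1))
                         - np.trace(rho1 @ logm_hermitian(rho2))))

# Assumed example density matrices (Hermitian, unit trace, full rank).
rho1 = np.array([[0.75, 0.25],
                 [0.25, 0.25]])
rho2 = np.array([[0.6, 0.0],
                 [0.0, 0.4]])

print(f"S(rho1||rho2) = {quantum_relative_entropy(rho1, rho2):.4f}  (>= 0)")
print(f"S(rho1||rho1) = {quantum_relative_entropy(rho1, rho1):.4f}  (= 0)")
```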
