
Department of Physics, College of Science
National Taiwan University
Master Thesis

Matrix Product State in One Dimensional Quantum Spin Systems
(矩陣積態在一維量子自旋系統的應用)

Yu-Cheng Su (蘇育正)

Advisor: Ying-Jer Kao, Ph.D.

June, 2009


Acknowledgments

Two years of graduate school are neither long nor short, and everything that happened along the way cannot be summed up in a few words. This experience taught me something important: a thesis is never completed by one person alone, and it is a great fortune to have people who can discuss and share the difficulties and setbacks encountered during research.

First, I would like to thank my advisor, Professor Ying-Jer Kao. His guidance taught me how to do research and the rigorous attitude research requires; discussions with him were always rewarding, and when I hit a bottleneck, talking with him often resolved the problem. I also want to thank Professor Anders Sandvik of Boston University, who, during his visit to NTU, enthusiastically discussed with us and offered advice that helped us overcome obstacles.

I would also like to thank the members of our group. Thanks to 昱均, 俊谷, and 承瑋 for helping me at the beginning of my graduate life, and to 信智, 揚智, 雅琳, and 柯達 for their discussions and help. Thanks also to the many teachers, my B92 classmates, and 佳寧 for cheering me up and giving me advice when I met with setbacks.

I also want to thank Teacher 聯立 and the director of the cram school where I worked part-time, as well as my colleagues there: 孝武, 裕翔, 澍寰, 玉女, and 念慈姐. Without your financial help, support at work, and shared life experience, I could hardly have come this far.

Finally, I want to thank my family. It is because of your support and encouragement that I have what I have achieved today. Thank you.


Abstract (Chinese)

In one-dimensional quantum spin systems, matrix product states can serve as trial wave functions for variational numerical simulations. In this work, we present two constructions of matrix product states, originating from the density matrix renormalization group and from quantum information theory. We develop two matrix-product-state algorithms for one-dimensional quantum systems: variational quantum Monte Carlo simulations with stochastic optimization and time-evolving block decimation. We generalize the stochastic optimization method to the open boundary condition and study the transverse Ising model and the Heisenberg model. In addition, we treat the infinite transverse Ising model; our results show that quantum entanglement is closely related to the quantum phase transition.

Keywords: matrix product states, quantum Monte Carlo, stochastic optimization, time-evolving block decimation, quantum entanglement


Abstract

In one-dimensional quantum spin systems, matrix product states (MPS) can be used as trial wave functions for variational numerical simulations.

In this thesis, we investigate the construction of MPS, which is related to the density matrix renormalization group (DMRG) and to quantum information theory (QIT). We develop two algorithms for one-dimensional systems: variational quantum Monte Carlo (QMC) simulations with stochastic optimization [1] and time-evolving block decimation (TEBD) [2, 3]. We generalize QMC with stochastic optimization to the open boundary condition and study the transverse Ising model and the Heisenberg model. We also apply the infinite TEBD algorithm [4] to the infinite transverse Ising model and demonstrate that entanglement is a key ingredient in the quantum phase transition.

Keywords: matrix product state, quantum Monte Carlo, stochastic optimization, TEBD, iTEBD, entanglement


Contents

Abstract

1 Introduction
  1.1 Overview
  1.2 Lattice models
    1.2.1 Transverse Ising model
    1.2.2 Heisenberg model

2 Matrix Product States
  2.1 MPS from the DMRG point of view
  2.2 MPS from the QIT point of view
    2.2.1 Tools in Linear Algebra
    2.2.2 Entanglement
    2.2.3 A variant form of MPS

3 Variational quantum Monte Carlo simulation with stochastic optimization
  3.1 Introduction to Monte Carlo simulation
    3.1.1 The Metropolis algorithm
    3.1.2 Measuring observable quantities
  3.2 Stochastic optimization
  3.3 Approximation forms for the open boundary condition
  3.4 Methods of measurements
    3.4.1 Measurements by Monte Carlo sampling
    3.4.2 Measurements by summing over all states
  3.5 Studies of the transverse Ising model
    3.5.1 The D dependence
    3.5.2 Ground state energy, magnetization and correlation functions
  3.6 Applications to the Heisenberg model
    3.6.1 The D dependence
    3.6.2 Exploiting symmetry

4 Imaginary time evolution with TEBD
  4.1 Imaginary time evolution
  4.2 Normalization conditions for MPS
  4.3 Updating the matrices
  4.4 The form of the wave function
    4.4.1 Infinite system
    4.4.2 Finite system
  4.5 Imaginary time evolution algorithm
  4.6 Infinite transverse Ising model

5 Conclusions

Bibliography


List of Figures

1.1 An illustration of frustration. The spins tend to lie anti-parallel to minimize the energy and form a competition between the nearest neighbor interaction and the next nearest neighbor interaction.
2.1 (a) Add the kth new spin |s_k⟩ to the system block basis states |β_{k-1}⟩ containing k − 1 sites. (b) Form the new superblock and construct the reduced density matrix for the new system block, keeping only D states to form a new set of basis states.
2.2 The recursive steps to transform the state into the spin basis.
2.3 The Schmidt decomposition of |Ψ⟩.
3.1 A sketch of the binning procedure.
3.2 The periodic transverse Ising spin chain with 32 spins and D = 6 at h = 1. The first run starts with random matrices and q_0 = 0.1. The second run starts with the optimized matrices obtained in the first run and q_0 = 0.01. In the second run, the difference between the optimized energy and the exact energy is less than 10^{-4}.
3.3 Comparison of the D dependence for the different coefficient forms. The system is the open critical transverse Ising chain with 32 spins at h = 1.
3.4 The open transverse Ising spin chain with 32 spins and D = 6 at h = 1. The first run starts with random matrices and q_0 = 0.1. The second run starts with the optimized matrices obtained in the first run and q_0 = 0.01. In the second run, the difference between the optimized energy and the exact energy is less than 10^{-4}.
3.5 Relative error in the energy as a function of D at h = 1.
3.6 Energy per bond as a function of h for the periodic boundary condition.
3.7 Magnetization as a function of h for the periodic boundary condition.
3.8 Correlation for the periodic boundary condition. The system is the transverse Ising model with n = 50 and D = 16.
3.9 Magnetization ⟨σ_i^x⟩ as a function of position i for the open boundary condition with n = 50 and D = 16. The dotted line is the corresponding magnetization for the periodic boundary condition.
3.10 Two sublattices of the antiferromagnetic Heisenberg chain.
3.11 Relative error in the energy as a function of D.
3.12 The convergence behavior as the symmetry terms are added. The system size is n = 16 and D = 6.
3.13 Optimized energy for the antiferromagnetic Heisenberg model with the next-nearest-neighbor interaction. The system size is n = 16 and D = 6; the relative errors in the energy are less than 10^{-4}.
4.1 Translational invariance under translation by two sites.
4.2 Updating the matrices in parallel. The dashed squares are updated simultaneously.
4.3 Visualization of the decomposition.
4.4 The distribution of the singular values λ_i for different h.
4.5 The single site entanglement and the nearest site correlation.
4.6 Energy convergence process.
4.7 Energy convergence as a function of h for different D.
4.8 Magnetization as a function of h. The dotted lines are the molecular field results.
4.9 Magnetization as a function of h for different D.


List of Tables

3.1 Results for the critical transverse Ising model. PBC stands for the periodic boundary condition and OBC for the open boundary condition. The statistical errors in the last displayed digits are indicated; the statistical errors are less than 10^{-5}.
3.2 Symmetry combinations.


Chapter 1

Introduction

1.1 Overview

The study of low-dimensional strongly correlated quantum systems has been one of the intriguing topics in theoretical condensed matter physics. However, because of the complexity of these systems, few analytically exact solutions are available; on the other hand, numerical methods are able to extract useful properties from them. The density matrix renormalization group (DMRG) invented by White [5, 6] has provided accurate descriptions of the ground state of one-dimensional strongly correlated Hamiltonians [7].

DMRG was developed to extend Wilson's numerical renormalization group [8] to general quantum lattice systems. The failure of Wilson's RG is identified as the inability to take the boundary conditions into account. DMRG was first applied only to systems with open boundary conditions, and was later extended to periodic boundary conditions from the quantum information perspective [9, 10].

The success of DMRG is related to matrix product states (MPS) [11]. MPS represent the wave function of the system in the DMRG method, and according to the quantum information perspective [2], MPS can properly account for entanglement in one-dimensional strongly correlated quantum systems.

On the other hand, DMRG in dimensions higher than one is insufficient to properly account for the entanglement of the system [12]. A generalization of MPS, projected entangled-pair states (PEPS) [13], has been proposed to deal with simulations in two or more spatial dimensions. MPS and PEPS can be used as trial wave functions [1, 13, 14, 15, 16, 17] in variational methods.

In Ref. [1], MPS is combined with the quantum Monte Carlo (QMC) method to sample the physical states instead of summing over all states, in order to reduce the computational cost.


Over the last few years, various algorithms for simulating the time evolution of one-dimensional quantum systems have been developed [18]. One of these algorithms, time-evolving block decimation (TEBD) [2, 3], uses MPS and emphasizes the connection between the computational cost of a simulation and the amount of entanglement in the system. The scheme was later recast into the language of DMRG [19, 20]. TEBD can be applied efficiently to quantum systems with a small amount of entanglement.

A generalization of TEBD, infinite TEBD [4], is able to simulate one-dimensional quantum lattice problems in the thermodynamic limit. The method simulates the infinite system directly, without resorting to extrapolation.

1.2 Lattice models

1.2.1 Transverse Ising model

In this thesis, we study the one-dimensional quantum transverse Ising model for both finite and infinite systems. The transverse Ising model is the simplest quantum spin model that exhibits a quantum phase transition [21].

The Hamiltonian is given by

H = -\sum_i \left( \sigma_i^z \sigma_{i+1}^z + h \sigma_i^x \right), \qquad (1.1)

where \sigma^x and \sigma^z are the Pauli matrices, and the sum over i runs over all sites. For h < 1, the ground state of this system has long-range Ising order in the z-direction. For h > 1, the ground state becomes disordered in the z-direction. A quantum phase transition occurs when the system goes from h < 1 to h > 1; the critical point is at h = 1. The transverse Ising model provides clear evidence that the lattice is most entangled at the critical point.

1.2.2 Heisenberg model

We also investigate the antiferromagnetic Heisenberg model with a next-nearest-neighbor interaction. The Hamiltonian is given by

H = \sum_i \left( J_1\, \mathbf{S}_i \cdot \mathbf{S}_{i+1} + J_2\, \mathbf{S}_i \cdot \mathbf{S}_{i+2} \right), \qquad (1.2)

where \mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma} and J_1, J_2 > 0. The model is invariant under a global SU(2) rotation, so the total spin is conserved. For J_2 = 0, the model can be solved exactly by the Bethe ansatz [22]. At J_2 = J_1/2, the ground state can be written exactly as a superposition of nearest neighbor valence bond states [23]. For J_1, J_2 > 0, there is a competition between the nearest neighbor interaction and the next nearest neighbor interaction, which is called frustration. This causes the "sign problem" [24, 25] in many QMC methods, which originates from the minus sign in the wave function when interchanging two fermions. An illustration of frustration is shown in Fig. 1.1.

Figure 1.1: An illustration of frustration. The spins tend to lie anti-parallel to minimize the energy and form a competition between the nearest neighbor interaction and the next nearest neighbor interaction.

The thesis is organized as follows. In Chapter 2, we review the construction of MPS from two perspectives and discuss the connection between them. In Chapter 3, we review the Monte Carlo method and stochastic optimization; we generalize this QMC method to the open boundary condition and study the transverse Ising model and the Heisenberg model. In Chapter 4, we review the TEBD and iTEBD methods in detail for imaginary time evolution; the infinite transverse Ising model is studied, and the results imply a connection between entanglement and quantum phase transitions. In Chapter 5, we summarize the results and conclude.


Chapter 2

Matrix Product States

In this chapter, we present the formulation of the matrix product states (MPS) from the density matrix renormalization group (DMRG) aspect and from the quantum information theory (QIT) [26] aspect. We will also discuss the notion of entanglement and use it to justify the use of MPS as a trial wave function.

2.1 MPS from the DMRG point of view

The development of the density matrix renormalization group method [5, 6] has enabled us to analyze and understand one-dimensional quantum many body systems with high accuracy. We shall describe the main idea of DMRG without going into too much detail. First of all, we start from a system which is small enough that we can diagonalize its Hamiltonian. This system is labeled the superblock, and it is divided into the system block and the environment block. The goal is to find a set of states of the system block which can optimally represent the superblock. We construct the reduced density matrix for the system block and diagonalize it, keeping only a number of states (e.g., D states) as basis states by dropping the states with smaller eigenvalues in the reduced density matrix, as they are less likely to be accessed. The Hamiltonian of the system block is transformed to these basis states. Then we add a single spin to the system block and use the transformed Hamiltonian together with the added spin to construct the environment block; thus, the new superblock can be formed. We repeat this spin-adding procedure recursively until the system reaches the desired size. In practice, the eigenvalues of the density matrix decrease rapidly, so the truncation errors are small. The name density matrix renormalization group reflects the fact that we keep the most relevant states of the density matrix.


Figure 2.1: (a) Add the kth new spin |s_k⟩ to the system block basis states |β_{k-1}⟩ containing k − 1 sites. (b) Form the new superblock and construct the reduced density matrix for the new system block, keeping only D states to form a new set of basis states.

The theoretical foundation of the success of DMRG was pointed out by Östlund and Rommer [11]. Their work shows that the DMRG construction is closely related to a position-dependent matrix product state. At each recursion step, DMRG is a particularly effective way to generate a D × D projection operator A^k_{\alpha\beta}(s_k), which projects the states |\beta_{k-1}\rangle \otimes |s_k\rangle onto a larger system with a set of new basis states |\alpha_k\rangle. That is (Fig. 2.1(a) to Fig. 2.1(b)),

|\alpha_k\rangle = \sum_{\beta=1}^{D} \sum_{s_k} A^k_{\alpha\beta}(s_k)\, |\beta_{k-1}\rangle \otimes |s_k\rangle. \qquad (2.1)

For a state |Ψ⟩ in a one-dimensional system containing n spins with periodic boundary conditions (s_{n+1} = s_1), we can construct a matrix product state using Eq. (2.1) recursively, as shown in Fig. 2.2. |Ψ⟩ is transformed into the spin basis

|\Psi\rangle = \sum_{s_1, s_2, \ldots, s_n} \mathrm{tr}\big( A_1(s_1) A_2(s_2) \cdots A_n(s_n) \big)\, |s_1 s_2 \ldots s_n\rangle, \qquad (2.2)

where \mathrm{tr}(\cdots) is the matrix trace. This state contains 2nD^2 parameters rather than 2^n ones.

Figure 2.2: The recursive steps to transform the state into the spin basis.

Furthermore, if the system has translational symmetry, the matrices A become independent of position. The wave function is now written

|\Psi\rangle = \sum_{s_1, s_2, \ldots, s_n} \mathrm{tr}\big( A(s_1) A(s_2) \cdots A(s_n) \big)\, |s_1 s_2 \ldots s_n\rangle. \qquad (2.3)

This is the matrix product state we are going to use as a trial wave function to study the ground state of quantum spin systems; but before that, we would like to discuss matrix product states further and give the criteria for the validity of MPS as a trial wave function.
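As a concrete illustration of Eq. (2.3), the coefficient of a given spin configuration is just the trace of a product of matrices. The following minimal sketch (random matrices and hypothetical names, our own illustration rather than the thesis's code) evaluates such a coefficient.

```python
import numpy as np

def mps_coefficient(A, spins):
    """Coefficient tr(A(s1) A(s2) ... A(sn)) of a translationally
    invariant periodic MPS; A maps a spin value to a D x D matrix."""
    D = A[spins[0]].shape[0]
    prod = np.eye(D)
    for s in spins:
        prod = prod @ A[s]
    return np.trace(prod)

rng = np.random.default_rng(0)
D = 3
A = {+1: rng.standard_normal((D, D)), -1: rng.standard_normal((D, D))}

# The full wave function on n sites has 2^n coefficients, but the
# translationally invariant MPS stores only 2 D^2 parameters.
spins = [+1, -1, -1, +1]
c = mps_coefficient(A, spins)
```

The cyclic property of the trace makes the coefficient invariant under cyclic shifts of the configuration, which reflects the translational symmetry of the periodic ansatz.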

2.2 MPS from the QIT point of view

To describe the state of n interacting quantum systems requires O(exp(n)) parameters. We can see immediately that simulating such systems is a non-polynomial problem, and this makes the problem intractable for present-day computational resources. Therefore, finding an effective approximation method to simulate interacting quantum systems is valuable to our understanding of quantum many body systems.

Recently, quantum information theory (QIT) has provided a different point of view on condensed matter systems. A theory of entanglement has been established and has offered new insights into dealing with quantum many body systems. It makes it possible to reduce the number of parameters needed to describe a slightly entangled system from non-polynomial order to polynomial order.

In this section, we first introduce two useful theorems in linear algebra which we are going to use to develop the theory, and then the concept of entanglement is introduced. Finally, we construct a variant form of matrix product states and compare it with the form discussed in the previous section.

2.2.1 Tools in Linear Algebra

Linear algebra has played an important role in the formulation of quantum mechanics. The rich properties of linear algebra have helped us express the theory in a concise language, and this is especially true in the field of quantum information theory. We present two important tools, the singular value decomposition and the Schmidt decomposition, which are used in QIT but are not standard material in quantum mechanics textbooks. No attempt is made at mathematical rigor. These tools are suitable for the study of composite quantum systems, which are the key structures in quantum many body systems.


Singular value decomposition. Let A be an m × n matrix (suppose n > m). Then there exist an m × m unitary matrix U, an n × n unitary matrix V, and an m × n diagonal matrix D such that

A = U D V^T = U \begin{pmatrix} d_1 & 0 & 0 & \cdots & 0 \\ 0 & d_2 & 0 & \cdots & 0 \\ & & \ddots & & \vdots \\ 0 & \cdots & 0 & d_m \cdots & 0 \end{pmatrix} V^T,

where the diagonal elements of D are called the singular values of A, with d_1 ≥ d_2 ≥ … ≥ d_m ≥ 0, and V^T is the transpose of V.

The singular value decomposition is a useful technique to factorize a general matrix into two unitary operators and a diagonal operator; dealing with these operators is easier than dealing with a general matrix. Here is a numerical example:

A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}. \qquad (2.4)
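The factorization in Eq. (2.4) can be reproduced with a standard numerical routine; the sketch below uses NumPy's `numpy.linalg.svd`, which is our choice of tool for illustration, not something prescribed by the thesis.

```python
import numpy as np

# Singular value decomposition of the 2 x 2 example in Eq. (2.4).
A = np.array([[1., 2.],
              [2., 1.]])
U, d, Vt = np.linalg.svd(A)   # A = U @ diag(d) @ Vt

# The singular values are returned sorted, d1 >= d2 >= 0; for this
# matrix they are 3 and 1, and U, Vt are orthogonal.
```

The sign conventions of the columns of U and rows of Vt may differ between implementations, but the product U diag(d) Vt always reconstructs A.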

Schmidt decomposition. A pure state |\Psi\rangle_{AB} of a composite system AB can be decomposed in terms of orthonormal states |i_A\rangle for system A and |i_B\rangle for system B such that

|\Psi\rangle_{AB} = \sum_i \lambda_i\, |i_A\rangle |i_B\rangle,

where \lambda_i ≥ 0 and \sum_i \lambda_i^2 = 1 in order to normalize the state |\Psi\rangle. The number of nonzero \lambda_i is called the Schmidt rank. The vectors |i_A\rangle and |i_B\rangle of the Schmidt decomposition are determined up to a phase, while the Schmidt rank and the \lambda_i are uniquely determined.

The Schmidt decomposition can be proved from the singular value decomposition. We will make use of this fact to perform our simulation in Section 4.3. The proof is outlined and a numerical example is given below.

Let |j_A\rangle and |k_B\rangle be any orthonormal bases for systems A and B. Then |\Psi\rangle_{AB} can be written

|\Psi\rangle_{AB} = \sum_{j,k} A_{jk}\, |j_A\rangle |k_B\rangle,

and by the singular value decomposition of A_{jk},

|\Psi\rangle_{AB} = \sum_{j,k,i} U_{ji} D_{ii} V^T_{ik}\, |j_A\rangle |k_B\rangle.

Defining |i_A\rangle = \sum_j U_{ji} |j_A\rangle, |i_B\rangle = \sum_k V_{ki} |k_B\rangle, and D_{ii} = \lambda_i, we get

|\Psi\rangle_{AB} = \sum_i \lambda_i\, |i_A\rangle |i_B\rangle, \qquad (2.5)

where \lambda_1 ≥ \lambda_2 ≥ … ≥ 0. This completes the proof.

The numerical example is given according to Eq. (2.4):

|\Psi\rangle_{AB} = |0_A\rangle|0_B\rangle + 2|0_A\rangle|1_B\rangle + 2|1_A\rangle|0_B\rangle + |1_A\rangle|1_B\rangle
= 3 \left[ \tfrac{1}{\sqrt{2}}(|0_A\rangle + |1_A\rangle) \right] \left[ \tfrac{1}{\sqrt{2}}(|0_B\rangle + |1_B\rangle) \right] + \left[ \tfrac{1}{\sqrt{2}}(|0_A\rangle - |1_A\rangle) \right] \left[ \tfrac{1}{\sqrt{2}}(-|0_B\rangle + |1_B\rangle) \right].

This shows that we can perform the Schmidt decomposition through the singular value decomposition, and this approach will be used in our simulation to find the \lambda_i in Section 4.3.
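This SVD-based route to the Schmidt coefficients can be sketched numerically: reshape the amplitude vector of a bipartite pure state into a matrix and take its singular values. The function below is our own illustrative helper, not code from the thesis.

```python
import numpy as np

def schmidt_coefficients(psi, dim_a, dim_b):
    """Schmidt coefficients of a bipartite pure state: normalize the
    amplitude vector, reshape it into a dim_a x dim_b matrix, and
    return its singular values (in decreasing order)."""
    psi = np.asarray(psi, dtype=float)
    psi = psi / np.linalg.norm(psi)          # normalize the state
    return np.linalg.svd(psi.reshape(dim_a, dim_b), compute_uv=False)

# The (unnormalized) example state |00> + 2|01> + 2|10> + |11>.
lam = schmidt_coefficients([1., 2., 2., 1.], 2, 2)
# lam**2 sums to one; a Schmidt rank greater than one signals entanglement.
```

Truncating this list of λ_i at some cutoff D is exactly the approximation step used later for slightly entangled states.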

2.2.2 Entanglement

A state |\Psi\rangle_{AB} is entangled if it cannot be decomposed into a product state |\Psi\rangle_A |\Psi\rangle_B. The examples below illustrate the notion of entanglement:

|\Psi\rangle_{AB} = |0_A\rangle|1_B\rangle + |1_A\rangle|1_B\rangle = (|0_A\rangle + |1_A\rangle)|1_B\rangle = |\Psi\rangle_A |\Psi\rangle_B \quad \text{(product state)},

|\Psi\rangle_{AB} = |0_A\rangle|0_B\rangle + |1_A\rangle|1_B\rangle \neq |\Psi\rangle_A |\Psi\rangle_B \quad \text{(entangled state)}.

We can clearly see that a state is entangled if and only if its Schmidt rank is greater than one, so the entanglement between systems A and B can be characterized by the Schmidt rank. The larger the Schmidt rank, the more entangled the state, and vice versa. This is the key idea in justifying the approximation of reducing the number of parameters from one that is exponential in the system size n to one that is linear in n.

2.2.3 A variant form of MPS

Let us now construct a variant form of MPS [2]. Consider a pure state |Ψ⟩ of an n body system. We first perform the Schmidt decomposition of |Ψ⟩, cutting the state at bond k between the kth site and the (k+1)th site, as shown in Fig. 2.3:

|\Psi\rangle = \sum_{\alpha_k=1}^{\chi_k} \lambda_{\alpha_k}\, |\psi^{[1\cdots k]}_{\alpha_k}\rangle\, |\psi^{[k+1\cdots n]}_{\alpha_k}\rangle, \qquad (2.6)

where \chi_k is the Schmidt rank at bond k.

After cutting the state |Ψ⟩ at bond k, we further cut the state |\psi^{[k+1\cdots n]}_{\alpha_k}\rangle at bond k + 1,

|\psi^{[k+1\cdots n]}_{\alpha_k}\rangle = \sum_{\alpha_{k+1}=1}^{\chi_{k+1}} \lambda_{\alpha_{k+1}}\, |\psi^{[k+1]}_{\alpha_k \alpha_{k+1}}\rangle\, |\psi^{[k+2\cdots n]}_{\alpha_{k+1}}\rangle, \qquad (2.7)

and expand the state |\psi^{[k+1]}_{\alpha_k \alpha_{k+1}}\rangle in terms of the spin basis,

|\psi^{[k+1]}_{\alpha_k \alpha_{k+1}}\rangle = \sum_{s_{k+1}} \Gamma^{[k+1] s_{k+1}}_{\alpha_k \alpha_{k+1}}\, |s_{k+1}\rangle. \qquad (2.8)

Substituting Eq. (2.8) into Eq. (2.7) gives

|\psi^{[k+1\cdots n]}_{\alpha_k}\rangle = \sum_{\alpha_{k+1}=1}^{\chi_{k+1}} \sum_{s_{k+1}} \Gamma^{[k+1] s_{k+1}}_{\alpha_k \alpha_{k+1}} \lambda_{\alpha_{k+1}}\, |s_{k+1}\rangle\, |\psi^{[k+2\cdots n]}_{\alpha_{k+1}}\rangle, \qquad (2.9)

and in a similar way we can get

|\psi^{[1\cdots k]}_{\alpha_k}\rangle = \sum_{\alpha_{k-1}=1}^{\chi_{k-1}} \sum_{s_k} \lambda_{\alpha_{k-1}} \Gamma^{[k] s_k}_{\alpha_{k-1} \alpha_k}\, |\psi^{[1\cdots k-1]}_{\alpha_{k-1}}\rangle\, |s_k\rangle, \qquad (2.10)

and therefore Eq. (2.6) becomes

|\Psi\rangle = \sum_{\alpha_{k-1}=1}^{\chi_{k-1}} \sum_{\alpha_k=1}^{\chi_k} \sum_{\alpha_{k+1}=1}^{\chi_{k+1}} \sum_{s_k, s_{k+1}} \lambda_{\alpha_{k-1}} \Gamma^{[k] s_k}_{\alpha_{k-1} \alpha_k} \lambda_{\alpha_k} \Gamma^{[k+1] s_{k+1}}_{\alpha_k \alpha_{k+1}} \lambda_{\alpha_{k+1}}\, |\psi^{[1\cdots k-1]}_{\alpha_{k-1}}\rangle\, |s_k\rangle |s_{k+1}\rangle\, |\psi^{[k+2\cdots n]}_{\alpha_{k+1}}\rangle. \qquad (2.11)

For a state |Ψ⟩ with periodic boundary conditions, repeating the above procedure recursively decomposes the state |Ψ⟩ into

|\Psi\rangle = \sum_{\alpha_1, \cdots, \alpha_n} \sum_{s_1, \cdots, s_n} \Gamma^{[1] s_1}_{\alpha_n \alpha_1} \lambda_{\alpha_1} \Gamma^{[2] s_2}_{\alpha_1 \alpha_2} \lambda_{\alpha_2} \cdots \lambda_{\alpha_{n-1}} \Gamma^{[n] s_n}_{\alpha_{n-1} \alpha_n} \lambda_{\alpha_n}\, |s_1 s_2 \ldots s_n\rangle,

where each \alpha ranges from one to the corresponding Schmidt rank.



Figure 2.3: The Schmidt decomposition of |Ψ⟩, cut at bond k into |\psi^{[1\cdots k]}\rangle and |\psi^{[k+1\cdots n]}\rangle with coefficients \lambda.


So far the expression is exact. The Schmidt rank \chi_k at each bond can grow exponentially with n. However, there are ground states of one-dimensional many body systems which are only slightly entangled, so that the values of \lambda_\alpha decrease rapidly with \alpha [2]. Hence, we can obtain a good approximation of the state |Ψ⟩ by truncating \lambda_\alpha at some \alpha. We define \chi_{\max} \equiv \max\{\chi_1, \chi_2, \ldots, \chi_n\}, and for slightly entangled states \chi_{\max} is truncated to some number D. In practice, D is determined through trial and error.

Thus, the state |Ψ⟩ is now written

|\Psi\rangle = \sum_{\alpha_1, \cdots, \alpha_n = 1}^{D} \sum_{s_1, \cdots, s_n} \Gamma^{[1] s_1}_{\alpha_n \alpha_1} \lambda_{\alpha_1} \Gamma^{[2] s_2}_{\alpha_1 \alpha_2} \lambda_{\alpha_2} \cdots \lambda_{\alpha_{n-1}} \Gamma^{[n] s_n}_{\alpha_{n-1} \alpha_n} \lambda_{\alpha_n}\, |s_1 s_2 \ldots s_n\rangle, \qquad (2.12)

where each \alpha runs from one to D; or, in a more compact form,

|\Psi\rangle = \sum_{s_1, \cdots, s_n} \mathrm{tr}\big( \Gamma^{[1] s_1} \lambda^{[1]} \Gamma^{[2] s_2} \lambda^{[2]} \cdots \lambda^{[n-1]} \Gamma^{[n] s_n} \lambda^{[n]} \big)\, |s_1 s_2 \ldots s_n\rangle, \qquad (2.13)

where each \Gamma^{[k]} is a D × D matrix and each \lambda^{[k]} is a D-dimensional vector, entering the product as the diagonal matrix \mathrm{diag}(\lambda^{[k]}). This state contains n(2D^2 + D) parameters rather than 2^n ones.

To compare Eq. (2.2) and Eq. (2.13), we first note the similarity between Eq. (2.1) and Eq. (2.9). After the truncation step, Eq. (2.9) is written

|\psi^{[k+1\cdots n]}_{\alpha_k}\rangle = \sum_{\alpha_{k+1}=1}^{D} \sum_{s_{k+1}} \Gamma^{[k+1] s_{k+1}}_{\alpha_k \alpha_{k+1}} \lambda_{\alpha_{k+1}}\, |s_{k+1}\rangle\, |\psi^{[k+2\cdots n]}_{\alpha_{k+1}}\rangle.

Compare with Eq. (2.1),

|\alpha_k\rangle = \sum_{\beta=1}^{D} \sum_{s_k} A^k_{\alpha\beta}(s_k)\, |\beta_{k-1}\rangle \otimes |s_k\rangle.

We can immediately see that A^k_{\alpha\beta}(s_k) \equiv \Gamma^{[k] s_k}_{\alpha\beta} \lambda_\beta, or, from Eq. (2.10), A^k_{\alpha\beta}(s_k) \equiv \lambda_\alpha \Gamma^{[k] s_k}_{\alpha\beta}; which form applies depends on at which end the spin is added. Now the D states kept in the DMRG iteration steps have a clear physical meaning: the number D is characterized by the amount of entanglement of the state; that is, the intrinsic computational cost is quantified by the entanglement. Also, the representations of the state |Ψ⟩ in Eqs. (2.2) and (2.13) are equivalent.
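The identification A^k_{\alpha\beta}(s_k) \equiv \Gamma^{[k] s_k}_{\alpha\beta} \lambda_\beta can be verified numerically by evaluating the same coefficient in both forms. The sketch below uses random tensors with hypothetical names; it is our own check, not code from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
n, D = 4, 3

# Random Gamma tensors (one per site, one matrix per spin value) and
# random positive lambda vectors, standing in for a real MPS.
Gamma = [{s: rng.standard_normal((D, D)) for s in (+1, -1)} for _ in range(n)]
lam = [np.abs(rng.standard_normal(D)) for _ in range(n)]

def coeff_gamma_lambda(spins):
    """tr(Gamma^[1]s1 lambda^[1] ... Gamma^[n]sn lambda^[n]), Eq. (2.13);
    each lambda enters as a diagonal matrix."""
    prod = np.eye(D)
    for k, s in enumerate(spins):
        prod = prod @ Gamma[k][s] @ np.diag(lam[k])
    return np.trace(prod)

def coeff_matrix_product(spins):
    """Same state in the A-matrix form of Eq. (2.2), taking
    A_k(s) = Gamma^[k]s diag(lambda^[k])."""
    A = [{s: Gamma[k][s] @ np.diag(lam[k]) for s in (+1, -1)} for k in range(n)]
    prod = np.eye(D)
    for k, s in enumerate(spins):
        prod = prod @ A[k][s]
    return np.trace(prod)
```

Both functions give identical coefficients for every spin configuration, illustrating the equivalence of the two representations.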

Before ending this chapter, we would like to present a form of MPS for a system with open boundary conditions. The wave function is written

|\Psi\rangle = \sum_{s_1, s_2, \ldots, s_n} V_1(s_1) A_2(s_2) A_3(s_3) \cdots A_{n-1}(s_{n-1}) V_n(s_n)\, |s_1 s_2 \ldots s_n\rangle, \qquad (2.14)

where V_1(s_1) and V_n(s_n) are D-dimensional vectors. This form can be derived from the above argument with a slight change in Eq. (2.12). We use this representation to simulate systems with open boundary conditions; more discussion of this form is given in the next chapter.


Chapter 3

Variational quantum Monte Carlo simulation with stochastic optimization

When an analytical solution is hard to obtain, variational methods are a useful approach to getting an approximate solution. For a system described by the Hamiltonian H, we set up a form of the trial wave function |Ψ⟩ with several parameters and use the fact that

\frac{\langle \Psi | H | \Psi \rangle}{\langle \Psi | \Psi \rangle} \geq E_{\text{ground state}}. \qquad (3.1)

By adjusting the parameters, we lower the energy as much as we can and obtain an approximate ground state wave function for the system H.

In this chapter, MPS is used as a trial wave function. The matrix elements are regarded as parameters and are adjusted to approach the ground state. We will first discuss the Monte Carlo simulation and then describe the optimization scheme. Finally, the above method is used to deal with one-dimensional quantum spin models.

3.1 Introduction to Monte Carlo simulation

Generally speaking, methods which involve making use of random numbers can be called Monte Carlo (MC) simulations. Monte Carlo simulation is an important method in condensed matter physics. Because of the complexity of many body systems, especially strongly correlated systems where perturbation theory is unmanageable, many problems cannot be solved analytically. In this situation, Monte Carlo simulation is an effective method to extract statistical properties from such systems.


3.1.1 The Metropolis algorithm

The goal of the Monte Carlo simulation is to calculate the expectation value \langle Q \rangle, which is an ensemble average of some observable quantity Q,

\langle Q \rangle = \frac{\sum_{c_i} W_{c_i} Q_{c_i}}{\sum_{c_i} W_{c_i}}, \qquad (3.2)

where Q_{c_i} is the value of the observable quantity in the ith configuration c_i and W_{c_i} is the weight of the ith configuration c_i. So the probability P_{c_i} for the ith configuration to occur is P_{c_i} = W_{c_i} / \sum_{c_i} W_{c_i}. To calculate \langle Q \rangle, we average the quantity Q_{c_i} over all states, \langle Q \rangle = \sum_i P_{c_i} Q_{c_i}; however, in reality, it is impossible to prepare all the configurations of a large system in our simulation. On the other hand, picking out states randomly from the system is usually a poor method: in most cases, it samples only a very small fraction of the relevant states. The remedy for this problem is that, instead of calculating the ensemble average directly, we calculate, through a Markov process, the average of the observable quantity by generating a chain of states which obey the probability of occurrence P_{c_i}.

A Markov process is a mechanism to generate a new state c_\nu from a state c_\mu through the transition probability t(c_\mu \to c_\nu). The transition probability for a Markov process should depend only on the properties of the current configurations c_\nu and c_\mu, and it should not vary over time.

To generate a chain of states which obey the probability of occurrence P_{c_i}, we first write down the master equation

\frac{dP_{c_\mu}}{dt} = \sum_{c_\nu} \left[ P_{c_\nu}\, t(c_\nu \to c_\mu) - P_{c_\mu}\, t(c_\mu \to c_\nu) \right]. \qquad (3.3)

For an equilibrium system, dP/dt = 0; furthermore, we enforce the detailed balance condition

P_{c_\nu}\, t(c_\nu \to c_\mu) = P_{c_\mu}\, t(c_\mu \to c_\nu). \qquad (3.4)

It is now clear that if the generated configurations are to be distributed according to P_{c_i}, the transition probability t(c_\nu \to c_\mu) should satisfy

\frac{W_{c_\mu}}{W_{c_\nu}} = \frac{P_{c_\mu}}{P_{c_\nu}} = \frac{t(c_\nu \to c_\mu)}{t(c_\mu \to c_\nu)}. \qquad (3.5)

Now the problem becomes: how do we make the algorithm satisfy the above condition? Even if we can suggest many Markov processes to generate a new state, we may not find one with the right transition probability. The solution to this problem is the concept of an acceptance ratio. We decompose the transition probability into two parts:

t(c_\mu \to c_\nu) = g(c_\mu \to c_\nu)\, a(c_\mu \to c_\nu), \qquad (3.6)

where g(c_\mu \to c_\nu) is the selection probability and a(c_\mu \to c_\nu) is the acceptance ratio (also called the acceptance probability). The selection probability is the probability of generating a new state c_\nu from an old state c_\mu. When we start from a state c_\mu and our algorithm generates a new state c_\nu, we accept the change to the new state according to the acceptance ratio. So, no matter what Markov process we choose to generate a new state, which determines the selection probability, we can always satisfy the condition of Eq. (3.5) as long as we adjust the acceptance ratio accordingly.

The most famous and widely used algorithm is the Metropolis algorithm [27]. In the Metropolis algorithm the selection probabilities g(c_\mu \to c_\nu) are all chosen to be equal. Therefore, Eq. (3.5) is now written

\frac{W_{c_\mu}}{W_{c_\nu}} = \frac{t(c_\nu \to c_\mu)}{t(c_\mu \to c_\nu)} = \frac{g(c_\nu \to c_\mu)\, a(c_\nu \to c_\mu)}{g(c_\mu \to c_\nu)\, a(c_\mu \to c_\nu)} = \frac{a(c_\nu \to c_\mu)}{a(c_\mu \to c_\nu)}. \qquad (3.7)

The acceptance ratio is determined according to the weights. We define

a(c_\nu \to c_\mu) = \min\left\{ 1, \frac{W_{c_\mu}}{W_{c_\nu}} \right\}. \qquad (3.8)

In this chapter, we deal with one-dimensional S = 1/2 quantum spin systems, using the single-spin-flip method to generate a new configuration; in other words, we flip one spin at a time and calculate the acceptance ratio to decide whether to accept the flip or not. The Metropolis algorithm goes as follows.

1. Generate a spin configuration c_1 randomly as a starting configuration.

2. Flip a spin in the configuration c_i to generate a new one, c_{i+1}.

3. Calculate the acceptance ratio

a(c_i \to c_{i+1}) = \min\left\{ 1, \frac{W_{c_{i+1}}}{W_{c_i}} \right\}. \qquad (3.9)

Compare the acceptance ratio a(c_i \to c_{i+1}) with a random number r \in [0, 1); if a(c_i \to c_{i+1}) > r, the configuration is updated to c_{i+1}; otherwise the flip is rejected.

4. Go back to step 2.

This is a standard algorithm in Monte Carlo simulation.
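The four steps above can be sketched as a generic single-spin-flip sampler. The toy weight function below is a hypothetical stand-in for a physical weight such as F^2(S); the function names are our own, not the thesis's.

```python
import numpy as np

def metropolis_chain(weight, n_spins, n_steps, rng):
    """Single-spin-flip Metropolis sampler over spin configurations,
    following steps 1-4 above; `weight` maps a configuration to W(c)."""
    config = rng.choice([-1, 1], size=n_spins)        # step 1: random start
    w = weight(config)
    for _ in range(n_steps):
        i = rng.integers(n_spins)                     # step 2: flip one spin
        config[i] *= -1
        w_new = weight(config)
        # step 3: accept with probability min(1, W_new / W_old)
        if rng.random() < min(1.0, w_new / w):
            w = w_new
        else:
            config[i] *= -1                           # rejected: undo the flip
    return config

# Toy weight favoring aligned neighbors (hypothetical stand-in for F^2(S)).
def weight(c):
    return np.exp(np.sum(c * np.roll(c, 1)))

rng = np.random.default_rng(3)
sample = metropolis_chain(weight, 16, 2000, rng)
```

In the thesis's setting the weight would be F^2(S) computed from the MPS, and measurements would be taken along the chain rather than only at the end.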


3.1.2 Measuring observable quantities

As mentioned in the previous section, to measure the expectation value \langle Q \rangle we would have to run over all configurations if we naively used Eq. (3.2). However, in the Metropolis algorithm, a chain of states is generated according to the occurrence probability of the states; that is, the number of times a configuration is generated is proportional to its weight in the system. So we measure the observable quantity Q as the configuration is updated, and supposing the measurement is performed M times, the expectation value \langle Q \rangle is now written

\langle Q \rangle = \frac{\sum_{i=1}^{M} Q_i}{M}. \qquad (3.10)

In our simulation scheme, we flip one spin at a time to generate a new configuration, so the difference between the old configuration and the new one is small. In other words, if we measured the observable quantity Q at every update, successive measurements would be strongly correlated. To reduce this problem, we do not measure the observable quantity Q at every update. Instead, we perform the measurement after each Monte Carlo step, where one Monte Carlo step is defined as attempting to flip spins n times; the number n is the length of the spin chain.

In order to reduce the noise caused by the randomness in the Monte Carlo simulation, we use a blocking method called the binning procedure. First, we perform m Monte Carlo steps and take a measurement after each Monte Carlo step. Second, we divide the measurements into m/b groups; each group contains b measurement results, which are averaged over b. This step is called making a bin. Finally, the observable quantities are obtained by averaging the m/b bin values. The binning procedure is illustrated in Fig. 3.1.

Figure 3.1: A sketch of the binning procedure: m successive measurements are grouped into m/b bins of b measurements each, with Q_{bin} = \frac{1}{b} \sum_{i=1}^{b} Q_i for each bin; \langle Q \rangle is then the average of the m/b bin values.
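The binning procedure can be sketched in a few lines. The function name is our own, and the statistical-error estimate from the bin standard deviation is a standard addition not spelled out in the text above.

```python
import numpy as np

def binned_average(measurements, b):
    """Binning procedure of Fig. 3.1: group m measurements into m // b
    bins of size b, average within each bin, then average the bins.
    Also returns the statistical error estimated from the bin spread
    (a standard companion of binning, added here for illustration)."""
    m = (len(measurements) // b) * b          # drop any incomplete bin
    bins = np.asarray(measurements[:m], dtype=float).reshape(-1, b).mean(axis=1)
    error = bins.std(ddof=1) / np.sqrt(len(bins))
    return bins.mean(), error

mean, err = binned_average(list(range(100)), b=10)
```

Because averaging within bins suppresses the correlation between successive Monte Carlo measurements, the bin-to-bin scatter gives a more honest error bar than the raw sample variance.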


3.2 Stochastic optimization

To obtain an approximate ground state wave function, our goal is to find the matrix elements that minimize the expectation value of the energy, E = \langle H \rangle, using the derivatives of the energy with respect to the matrix elements. We illustrate the optimization scheme for a translationally invariant periodic S = 1/2 spin system with n spins.

The first step is finding an appropriate form of E for Monte Carlo sampling. The wave function of the spin chain can be written

|\Psi\rangle = \sum_{s_1, s_2, \ldots, s_n} \mathrm{tr}\big( A(s_1) A(s_2) \cdots A(s_n) \big)\, |s_1 s_2 \ldots s_n\rangle = \sum_{S} F(S)\, |S\rangle, \qquad (3.11)

where the spins si =±1 are the eigenvalues of σiz, |S = |s1, s2, . . . , sn, and F (S) is the coefficient for state |S; A(±1) are two D × D general matrices, and they are taken to be real. The expectation value of energy can be expressed as

\[
E = \frac{\sum_{S,S'} F(S)F(S')\,\langle S|H|S'\rangle}{\sum_{S} F^{2}(S)}
  = \frac{\sum_{S} F^{2}(S)\,E(S)}{\sum_{S} F^{2}(S)}
  = \langle E(S)\rangle, \tag{3.12}
\]

where $E(S) = \sum_{S'} \frac{F(S')}{F(S)}\langle S|H|S'\rangle$. Identifying $F^{2}(S)$ as the weight of a configuration, we can use the Metropolis algorithm, with transition probability $t(S\to S') = \min\{1, F^{2}(S')/F^{2}(S)\}$, to generate successive spin configurations by flipping one spin at a time. By measuring $E(S)$ in the successively generated spin configurations and averaging over these measurements with the binning procedure, we obtain $E = \langle E(S)\rangle$.
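As an illustration of this sampling scheme, the sketch below generates configurations with weight $F^2(S)$ by recomputing the full trace at every flip. It is a deliberately naive version (each flip costs a full chain multiplication), and all names are ours; the efficient bookkeeping that avoids recomputing the product is described later in this section.

```python
import numpy as np

def amplitude(A, spins):
    """F(S) = tr(A(s1) A(s2) ... A(sn)) for the periodic MPS of Eq. (3.11)."""
    prod = np.eye(A[+1].shape[0])
    for s in spins:
        prod = prod @ A[s]
    return np.trace(prod)

def metropolis_sweep(A, spins, rng):
    """One Monte Carlo step: attempt to flip every spin once, accepting
    each flip with probability min(1, F(S')^2 / F(S)^2)."""
    F = amplitude(A, spins)
    for k in range(len(spins)):
        spins[k] = -spins[k]                 # propose flipping spin k
        F_new = amplitude(A, spins)          # naive full recomputation
        if rng.random() < min(1.0, (F_new / F) ** 2):
            F = F_new                        # accept
        else:
            spins[k] = -spins[k]             # reject: restore the spin
    return spins

# Two random real D x D matrices A(+1) and A(-1), here with D = 3
rng = np.random.default_rng(1)
A = {+1: rng.standard_normal((3, 3)), -1: rng.standard_normal((3, 3))}
spins = metropolis_sweep(A, [1] * 8, rng)
```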

Second, the derivatives of the energy with respect to the matrix elements are also required. Taking the derivative directly, we have

\[
\frac{\partial E}{\partial a^{s}_{ij}}
= \frac{\partial}{\partial a^{s}_{ij}}\left[\frac{\sum_{S,S'}F(S)F(S')\,\langle S|H|S'\rangle}{\sum_{S}F^{2}(S)}\right]
= 2\left\langle \frac{1}{F(S)}\frac{\partial F(S)}{\partial a^{s}_{ij}}\,E(S)\right\rangle
- 2\left\langle \frac{1}{F(S)}\frac{\partial F(S)}{\partial a^{s}_{ij}}\right\rangle\big\langle E(S)\big\rangle. \tag{3.13}
\]

Calculating $\partial F(S)/\partial a^{s}_{ij}$ seems to be a formidable task; however, if we write out the indices of the matrices, we get

\[
\begin{aligned}
\frac{\partial F(S)}{\partial a^{s}_{ij}}
&= \frac{\partial}{\partial a^{s}_{ij}}\,\mathrm{tr}\big(A(s_1)A(s_2)\cdots A(s_n)\big) \\
&= \sum_{k=1}^{n}\delta_{s,s_k}\,\frac{\partial}{\partial a^{s}_{ij}}\big[a^{s_1}_{\alpha_1\alpha_2}a^{s_2}_{\alpha_2\alpha_3}\cdots a^{s_{k-1}}_{\alpha_{k-1}\alpha_k}\,a^{s_k}_{\alpha_k\alpha_{k+1}}\,a^{s_{k+1}}_{\alpha_{k+1}\alpha_{k+2}}\cdots a^{s_{n-1}}_{\alpha_{n-1}\alpha_n}a^{s_n}_{\alpha_n\alpha_1}\big] \\
&= \sum_{k=1}^{n}\delta_{s,s_k}\,a^{s_{k+1}}_{j\alpha_{k+2}}a^{s_{k+2}}_{\alpha_{k+2}\alpha_{k+3}}\cdots a^{s_{n-1}}_{\alpha_{n-1}\alpha_n}a^{s_n}_{\alpha_n\alpha_1}a^{s_1}_{\alpha_1\alpha_2}a^{s_2}_{\alpha_2\alpha_3}\cdots a^{s_{k-1}}_{\alpha_{k-1}i} \\
&= \sum_{k=1}^{n}\delta_{s,s_k}\big[A(s_{k+1})A(s_{k+2})\cdots A(s_n)A(s_1)\cdots A(s_{k-2})A(s_{k-1})\big]_{ji}.
\end{aligned} \tag{3.14}
\]

Defining the matrices

\[
M(k) = A(s_{k+1})A(s_{k+2})\cdots A(s_n)A(s_1)\cdots A(s_{k-2})A(s_{k-1}), \tag{3.15}
\]

the derivative of the weight is rewritten as

\[
\frac{\partial F(S)}{\partial a^{s}_{ij}} = \sum_{k=1}^{n}\delta_{s,s_k}\,M_{ji}(k). \tag{3.16}
\]
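For concreteness, Eq. (3.16) can be evaluated directly by forming $M(k)$ for every site $k$ with $s_k = s$. The helper below is an unoptimized sketch (names are ours), and it can be checked against a finite-difference derivative of $F(S)$:

```python
import numpy as np

def weight_derivative(A, spins, s):
    """dF(S)/da^s_{ij} of Eq. (3.16): sum over sites k with s_k = s of
    M(k)_{ji}, where M(k) = A(s_{k+1}) ... A(s_n) A(s_1) ... A(s_{k-1})."""
    n = len(spins)
    D = A[+1].shape[0]
    grad = np.zeros((D, D))
    for k in range(n):
        if spins[k] != s:
            continue
        M = np.eye(D)
        for step in range(1, n):   # the n-1 matrices after site k, cyclically
            M = M @ A[spins[(k + step) % n]]
        grad += M.T                # (dF/da^s)_{ij} = M_{ji}
    return grad
```

Each site costs a full chain multiplication here; the caching scheme described next brings this down to a single matrix product per site.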

In order to minimize the energy, we have to calculate $F(S)$, $F(S')$ and $M(k)$ during the update and the measurement. These terms involve many matrix multiplications and are time consuming. Here we implement a scheme to reduce the computational effort of calculating these terms for the update and the measurement. Noting the form of $M(k)$, we introduce two more matrices $L(k) = A(s_k)A(s_{k+1})\cdots A(s_n)$ and $R(k) = A(s_1)A(s_2)\cdots A(s_k)$ such that $M(k) = L(k+1)R(k-1)$. Also, we define $L(n+1) = R(0) = I$.

The scheme goes as follows.

1. Calculate and store $L(2), L(3), \ldots, L(n)$ based on a random initial configuration or the configuration from the previous run.

2. Starting from $s_1$, we flip the spins sequentially to generate successive spin configurations. Each flip is accepted with the Metropolis probability $t(S\to S') = \min\{1, F^{2}(S')/F^{2}(S)\}$. Using the cyclic property of the trace, we calculate the amplitude of the configuration with $s_k$ flipped as $F(S'_k) = \mathrm{tr}\big(A(-s_k)M(k)\big)$.

3. After each attempt to flip the spin $s_k$, $L(k+1)$ is no longer needed, so we store $R(k) = R(k-1)A(s_k)$ in its place for future use.


4. After a sweep of spin updates, we start the measurement from the spin $s_n$. At this point the matrices $R(k)$ have all been generated and stored. We traverse the spin chain from $k = n$ to $1$ to measure the observable quantities, and the matrices $L(k)$ are regenerated and stored along the way.

5. Go back to 2 until the energy is converged.
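Steps 1–4 above can be sketched as follows (a minimal illustration in our own notation, omitting the measurement step; for simplicity, $R(k)$ is accumulated in a separate variable rather than stored in place of $L(k+1)$):

```python
import numpy as np

def sweep(A, spins, rng):
    """One Metropolis sweep using cached partial products, so each
    flip costs one matrix product instead of n of them."""
    n, D = len(spins), A[+1].shape[0]
    # Step 1: cache[k] = A(s_{k+1}) ... A(s_n), i.e. L(k+2) in 1-based
    # notation; the last slot holds L(n+1) = I.
    cache = [np.eye(D) for _ in range(n)]
    for k in range(n - 2, -1, -1):
        cache[k] = A[spins[k + 1]] @ cache[k + 1]
    R = np.eye(D)                            # R(0) = I
    F = np.trace(A[spins[0]] @ cache[0])     # current amplitude F(S)
    for k in range(n):
        M = cache[k] @ R                     # M(k) = L(k+1) R(k-1)
        F_new = np.trace(A[-spins[k]] @ M)   # cyclic property of the trace
        if rng.random() < min(1.0, (F_new / F) ** 2):
            spins[k], F = -spins[k], F_new   # accept the flip
        R = R @ A[spins[k]]                  # step 3: build R(k) for the next site
    return spins, F
```

After the sweep, the tracked amplitude `F` agrees with a direct evaluation of the full matrix product over the final configuration, which is a convenient consistency check.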

After $b$ measurements, we obtain a bin measurement of the energy and the derivatives. We update the matrix elements after a bin is made. The bin measurements are used to update the matrix elements $a^{s}_{ij}$:

\[
a^{s}_{ij} \rightarrow a^{s}_{ij} - q(p)\cdot r^{s}_{ij}\cdot \mathrm{sign}\!\left(\frac{\partial E}{\partial a^{s}_{ij}}\right), \tag{3.17}
\]

where $r^{s}_{ij} \in [0,1)$ is random and $q(p)$ is the maximum change, which decreases as a function of some number $p$. Each parameter is changed according to the sign of its derivative, but the parameters are changed independently by random, well-bounded amounts. After each update of all the matrix elements, the matrices are normalized so that the largest element satisfies $|a^{s}_{ij}| = 1$. In our simulation, we use the form $q(p) = q_0\,p^{-\alpha}$, with $q_0 = 0.05$–$0.1$ and $\alpha = 0.7$–$0.8$.

As mentioned in Section 3.1.2, the actual output of the expectation value is the average over bins. Let $p$ be the number of outputs; that is, $q(p)$ decreases with the output number. Furthermore, when the energy approaches the minimum, the derivatives become smaller, and noise from the randomness may affect the measurement of the derivatives. So, in order to suppress this noise, we use the form $b = b_0 p$ to increase the number of measurements in a bin, with $b_0 = 100$–$200$, and 20–40 bins are averaged to make one output. Also, after performing the above procedure to obtain optimized matrices, we can use the optimized matrices as initial values and restart the optimization with a smaller $q_0$. Repeating this process produces better convergence.
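The parameter update of Eq. (3.17), with the schedule $q(p) = q_0 p^{-\alpha}$, can be sketched as follows (all names are ours; `grads` stands for the bin-averaged derivatives, and we assume the normalization uses the largest element over both matrices, which the text leaves implicit):

```python
import numpy as np

def stochastic_update(A, grads, p, q0=0.05, alpha=0.7, rng=None):
    """Eq. (3.17): move each matrix element against the sign of its
    derivative by a random amount bounded by q(p) = q0 * p**(-alpha),
    then rescale so that the largest |element| equals 1."""
    rng = rng or np.random.default_rng()
    q = q0 * p ** (-alpha)                    # maximum step size
    for s in (+1, -1):
        A[s] = A[s] - q * rng.random(A[s].shape) * np.sign(grads[s])
    # Normalization: assumed here to be over both matrices jointly.
    norm = max(np.abs(A[+1]).max(), np.abs(A[-1]).max())
    for s in (+1, -1):
        A[s] = A[s] / norm
    return A
```

Rescaling by the largest element fixes the overall scale of the MPS, which the energy expectation value is insensitive to, and keeps the matrix elements well bounded during the optimization.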

A typical optimization process is shown in Fig. 3.2. Each point is an output with 20 bins and $b_0 = 100$. We notice that the energy decreases exponentially, and the variance of the energy is also reduced quickly during the optimization process. When the matrices obtained in the previous run are used as the initial matrices, together with a smaller $q_0$, the energy reaches a lower value.

The computational cost of the above optimization scheme scales as $ND^3$, while DMRG scales as $ND^6$. Besides, the scheme can be generalized to the open boundary condition, where the cost scales as $ND^2$. The generalization is straightforward; we discuss the scheme in the next section.

