Cycle-symmetric matrices and convergent neural networks


Chih-Wen Shih ∗, Chih-Wen Weng

Department of Applied Mathematics, National Chiao Tung University, Hsinchu, Taiwan, ROC

Received 30 November 1999; received in revised form 20 June 2000; accepted 21 June 2000. Communicated by C.K.R.T. Jones.

Abstract

This work investigates a class of neural networks with cycle-symmetric connection strength. We shall show that, by changing the coordinates, the convergence of dynamics by Fiedler and Gedeon [Physica D 111 (1998) 288] is equivalent to the classical results. This presentation also addresses the extension of the convergence theorem to other classes of signal functions with saturations. In particular, the result of Cohen and Grossberg [IEEE Trans. Syst. Man Cybernet. SMC-13 (1983) 815] is recast and extended with a more concise verification. © 2000 Elsevier Science B.V. All rights reserved.

MSC: 34C37; 68T10; 92B20

Keywords: Neural networks; Cycle-symmetric matrix; Lyapunov function; Convergence of dynamics

1. Introduction

The notion of a neural network, in addition to biological modeling, has been applied to various scientific areas such as circuit architecture and numerical computation. In designing a neural network, it is usually of prime importance to guarantee the convergence of the corresponding dynamical system, cf. [1–4,6,7,9]. The convergence of dynamics means that every solution tends to a stationary solution as time goes to positive infinity. Such convergence is often established by constructing a Lyapunov function and then applying LaSalle’s invariance principle. Classical results on the construction of the Lyapunov function require symmetry of the matrix of connection strengths between neurons. For example, the works by Cohen and Grossberg [4] and by Chua and Yang [1] made this assumption. Significant progress has been made by Fiedler and Gedeon [5], who successfully extended the Lyapunov function, and hence the convergence of dynamics, to more general matrices of connection strength. The following preparation is needed to define such matrices.

∗ Corresponding author. Fax: +886-3-572-4679. E-mail address: [email protected] (C.-W. Shih).

For an undirected graph $G$ without loops or multiple edges, a path is defined as a sequence of vertices $v_1 v_2 \cdots v_k$, $k \ge 1$, where $v_i v_{i+1}$ is an edge of $G$ for each $i \in \{1, \dots, k-1\}$ and there are no repeated vertices except possibly the first and the last. By a cycle, we mean a closed path of length greater than or equal to three, that is, $v_1 = v_k$ and $k \ge 3$. Let $B$ denote an $n \times n$ matrix with entries $\beta_{ij} \in \mathbb{R}$, $i, j \in \{1, 2, \dots, n\}$. If $\beta_{ij} \neq 0$ whenever $\beta_{ji} \neq 0$, then a graph with $n$ vertices can be defined from $B$. Indeed, for $i \neq j$, an unordered pair $(i, j)$ with $\beta_{ij} \neq 0$ is an edge of this graph. Consider the class of matrices with entries $\beta_{ij}$ satisfying

(H1) $\beta_{ij}\beta_{ji} > 0$ if $\beta_{ij} \neq 0$,

(H2) $\prod_{C} \beta_{ik} = \prod_{C} \beta_{ki}$ along every cycle $C$,

where $\prod$ denotes the product. Notably, condition (H1) means that the entries of $B$ are sign-symmetric. This class of matrices, called cycle-symmetric herein, has been investigated in [11,12]. In fact, these matrices are characterized as the matrices that are similar to symmetric matrices via real diagonal matrices. Restated, if $B$ satisfies (H1) and (H2), then there exists an invertible diagonal matrix $P$ such that $PBP^{-1}$ is a symmetric matrix. Notably, this characterization theorem was first obtained by Parter and Youngs [12]. Maybee [11] then considered a class of so-called combinatorially symmetric matrices ($B = [\beta_{ij}]$ with $\beta_{ji} \neq 0$ if $\beta_{ij} \neq 0$) and weakened condition (H1). A matrix is called pseudosymmetric therein if it is similar to a symmetric matrix via a real diagonal matrix. However, (H1) and (H2) are the basic conditions for a matrix to be pseudosymmetric.
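The characterization can be made concrete with a small numerical sketch. Assuming $B$ satisfies (H1) and (H2), symmetry of $PBP^{-1}$ requires $p_i^2 \beta_{ij} = p_j^2 \beta_{ji}$ on every edge, so one may fix $p = 1$ at a root vertex and propagate along the graph of $B$. The function names `symmetrizing_diagonal` and `conjugate` and the example matrix `B` are illustrative assumptions, not taken from [11,12].

```python
import math
from collections import deque

def symmetrizing_diagonal(B):
    """Return diagonal entries p of P such that P B P^{-1} is symmetric.

    Assumes B satisfies (H1) and (H2); (H2) is not checked here.
    """
    n = len(B)
    p = [None] * n
    for root in range(n):            # one pass per connected component
        if p[root] is not None:
            continue
        p[root] = 1.0
        queue = deque([root])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if j != i and B[i][j] != 0 and p[j] is None:
                    # Symmetry needs p_i^2 * b_ij = p_j^2 * b_ji;
                    # (H1) makes the ratio b_ij / b_ji positive.
                    p[j] = p[i] * math.sqrt(B[i][j] / B[j][i])
                    queue.append(j)
    return p

def conjugate(B, p):
    """Entries of P B P^{-1} for P = diag(p)."""
    n = len(B)
    return [[p[i] * B[i][j] / p[j] for j in range(n)] for i in range(n)]

# Hypothetical cycle-symmetric matrix: along the cycle 0-1-2-0,
# 4 * 2 * 0.75 == 1 * 0.5 * 12, so (H2) holds.
B = [[1.0, 4.0, 12.0],
     [1.0, -1.0, 2.0],
     [0.75, 0.5, 0.5]]
p = symmetrizing_diagonal(B)
A = conjugate(B, p)
```

Condition (H2) is what makes the propagated values independent of the path taken through the graph; without it, two routes to the same vertex could assign different values of $p_j$.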

Fiedler and Gedeon [5] generalized the Lyapunov function for the neural network proposed in [4] to accommodate networks with connection strength satisfying (H1) and (H2). The condition (H1) was further weakened in [6]. The studies in [5,6] thus concluded the convergence of dynamics for the system with a larger class of connection strength.

The first goal of this paper is to show that, with the characterization of the cycle-symmetric matrices, a change of coordinates can transform the system to a similar system, but with symmetric connection strength. Therefore, the convergence results in [5] are equivalent to the classical ones. This approach answers the question raised in [6] (also mentioned in [5]), which is whether the characterization theorems in [11,12] can be applied directly to prove the convergence theorem. The new treatment in this presentation is considered a more natural generalization of the classical results, since, for example, symmetric matrices are easier to handle in various related computations.

Our second objective in this investigation is to extend the convergence theory to other signal functions, in particular, signal functions with saturations. Such functions have been used as output functions in cellular neural networks [1–3]. Similar signal functions have also been considered in [4], where the transition from zero slope to positive slope in the signal functions relates to the notion of an inhibitory signal threshold. Based on our previous technique of changing coordinates, we shall extend the convergence of dynamics to systems with more general saturated signal functions. This work not only provides an explicit formulation of these signal functions but also develops a new, concise treatment of the proof of convergence. Our first step is to partition the phase space according to the configurations of the signal functions. The convergence of dynamics is then established by constructing a global Lyapunov function as well as certain regional Lyapunov functions. The latter are naturally tied to the existence of equilibria of the system and to the partitioning of the phase space. This approach is more straightforward than the one in [4] and more general than the one in [10]. This investigation further explores the intrinsic structure of the model equations discussed in this presentation. Indeed, for an arbitrary dynamical system with a global Lyapunov function, the existence of a regional Lyapunov function on the set where the global Lyapunov function is constant is not automatically guaranteed.

We shall present our results for strictly increasing and two-sided saturated signal functions in Section 2. Extension of the convergence theorem to more general saturated signal functions will be discussed in Section 3.

2. Main results

We consider the following system proposed by Cohen and Grossberg [4], and later investigated in [5,6],

\[
\frac{dx_i}{dt} = a_i(x)\Bigl[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij} f_j(x_j)\Bigr], \quad i = 1, 2, \dots, n, \tag{2.1}
\]

where $x = (x_1, x_2, \dots, x_n)$. Denote by $F_i(x)$ the right-hand side of (2.1) and $F = (F_1, F_2, \dots, F_n)$. The following assumptions have been made in [5,6] in addition to (H1), (H2):

(H3) $a_i(x) \ge 0$ for all $x \in \mathbb{R}^n$ and every $i = 1, 2, \dots, n$,

(H4) $f_i'(\xi) > 0$ for all $\xi \in \mathbb{R}$ and every $i = 1, 2, \dots, n$.

There are extra conditions which guarantee dissipativeness, and hence the existence of a global attractor, for the system (2.1).

(H5) All $f_i$ are bounded; $a_i(x) > 0$ for all sufficiently large $|x|$; $\gamma_i(x_i)x_i \to -\infty$ as $|x_i| \to \infty$.

Let $B = [\beta_{ij}]$ be a cycle-symmetric matrix, that is, $B$ satisfies (H1) and (H2). By the theorem in [11,12], there exists an invertible diagonal matrix $P$ such that $PBP^{-1} = A$ with $A = [\alpha_{ij}]$, a symmetric matrix. Denote the diagonal entries of $P$ by $p_1, p_2, \dots, p_n$, where every $p_i$ is nonzero. Set $y = Px$, that is, $y_i = p_i x_i$ for each $i$. Eq. (2.1) in the new variables takes the following form:

\[
\begin{aligned}
\frac{dy_i}{dt} &= p_i a_i(P^{-1}y)\Bigl[\gamma_i(p_i^{-1}y_i) - \sum_{j=1}^{n} \beta_{ij} f_j(p_j^{-1}y_j)\Bigr] \\
&= a_i(P^{-1}y)\Bigl[p_i\gamma_i(p_i^{-1}y_i) - \sum_{j=1}^{n} p_i\beta_{ij}p_j^{-1}\, p_j f_j(p_j^{-1}y_j)\Bigr] \\
&= \tilde{a}_i(y)\Bigl[\tilde{\gamma}_i(y_i) - \sum_{j=1}^{n} \alpha_{ij}\tilde{f}_j(y_j)\Bigr],
\end{aligned}
\]

where $\tilde{a}_i(y) = a_i(P^{-1}y)$, $\tilde{\gamma}_i(y_i) = p_i\gamma_i(p_i^{-1}y_i)$, and $\tilde{f}_i(y_i) = p_i f_i(p_i^{-1}y_i)$. Notice that $\tilde{a}_i$ satisfies (H3), $\tilde{f}_i$ satisfies (H4), and $\tilde{a}_i$, $\tilde{f}_i$, and $\tilde{\gamma}_i$ satisfy (H5). Therefore, the Lyapunov function

\[
V(y) = -\sum_{i=1}^{n}\Bigl\{\int^{y_i} \tilde{\gamma}_i(\xi)\tilde{f}_i'(\xi)\,d\xi - \frac{1}{2}\sum_{j=1}^{n}\alpha_{ij}\tilde{f}_i(y_i)\tilde{f}_j(y_j)\Bigr\},
\]

which was proposed in [4] for symmetric connection strength, still applies here. We thus obtain the main theorem in [5].
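The role of this Lyapunov function can be illustrated numerically. The following is a minimal sketch under assumed data that do not appear in the paper: a symmetric matrix `A`, $a_i \equiv 1$, $\gamma_i(x) = -x$, $f_i = \tanh$, and lower integration limit 0. It integrates the system with explicit Euler steps and records $V$ along the orbit; for a sufficiently small step size the recorded values should be non-increasing.

```python
import math

# Hypothetical data, not from the paper: symmetric A, a_i = 1,
# gamma_i(x) = -x, f_i = tanh (so f_i' > 0 and f_i is bounded).
A = [[1.0, -0.5],
     [-0.5, 2.0]]

def field(x):
    """Right-hand side of the system with a_i = 1 and gamma_i(x) = -x."""
    n = len(x)
    return [-x[i] - sum(A[i][j] * math.tanh(x[j]) for j in range(n))
            for i in range(n)]

def lyapunov(x):
    """V with lower integration limit 0 (an assumption of this sketch)."""
    n = len(x)
    # Closed form: integral_0^x (-s) sech^2(s) ds = -(x tanh x - log cosh x)
    integral = sum(-(xi * math.tanh(xi) - math.log(math.cosh(xi))) for xi in x)
    quad = 0.5 * sum(A[i][j] * math.tanh(x[i]) * math.tanh(x[j])
                     for i in range(n) for j in range(n))
    return -(integral - quad)

def simulate(x0, h=0.01, steps=500):
    """Explicit Euler steps; returns the final state and V along the orbit."""
    x, values = list(x0), [lyapunov(x0)]
    for _ in range(steps):
        F = field(x)
        x = [xi + h * Fi for xi, Fi in zip(x, F)]
        values.append(lyapunov(x))
    return x, values

x_final, V_values = simulate([0.3, -0.2])
```

Explicit Euler is used only for brevity; the monotonicity of $V$ is a property of the continuous flow, and a coarse step size could violate it numerically.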

Theorem 2.1. Assume (H1)–(H5). The dynamics of (2.1) are convergent if every equilibrium is isolated.

The second goal of this presentation is to extend the convergence of dynamics for (2.1) to other signal functions $f_i$. In particular, we consider sigmoidal $f_i$ with some saturations. In this case, the slope of $f_i$ is only nonnegative (compare with (H4)). These functions are described as follows.

(H4′) Let $\{b_i\}_1^n$, $\{c_i\}_1^n$, $\{u_i\}_1^n$, $\{v_i\}_1^n$ be sequences of real numbers with $b_i < c_i$, $u_i < v_i$ for each $i$. For $i = 1, 2, \dots, n$, let $f_i = f_i(\xi_i)$ be a function which is continuous on $\mathbb{R}$, increasing on $[b_i, c_i]$, $f_i(\xi_i) = v_i$ for all $\xi_i \ge c_i$, and $f_i(\xi_i) = u_i$ for all $\xi_i \le b_i$.
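One concrete instance of such a signal function is piecewise linear. The sketch below assumes linearity on $[b, c]$, which is stronger than the hypothesis requires, and the helper name `make_saturated` is hypothetical. It also returns the inverse of $f$ restricted to $[b, c]$, the map that appears as $g_i$ in (2.7).

```python
def make_saturated(b, c, u, v):
    """Piecewise-linear f with f = u on (-inf, b] and f = v on [c, inf).

    Linearity on [b, c] is an assumption of this sketch; the hypothesis
    only requires f to be continuous and increasing there.
    """
    if not (b < c and u < v):
        raise ValueError("need b < c and u < v")

    def f(xi):
        if xi <= b:
            return u
        if xi >= c:
            return v
        return u + (v - u) * (xi - b) / (c - b)

    def g(eta):
        """Inverse of f restricted to [b, c], defined on [u, v]."""
        if not (u <= eta <= v):
            raise ValueError("eta outside [u, v]")
        return b + (c - b) * (eta - u) / (v - u)

    return f, g

f, g = make_saturated(b=-1.0, c=1.0, u=0.0, v=2.0)
```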

A typical function $f_i$ satisfying (H4′) is depicted in Fig. 1. The phase space $\mathbb{R}^n$ for the dynamical system generated by (2.1) can be decomposed into $3^n$ regions, corresponding to the partitioning of the domains in the definition of these sigmoidal functions $f_i$. The following labeling and notations are used to describe these regions. Denote by $N_n$ the set of positive integers from 1 to $n$, and by $A^{N_n}$ the set of all functions $\sigma : N_n \to A$, where $A := \{-1, 0, 1\}$. It follows that

\[
\bigcup_{\sigma \in A^{N_n}} \Omega_\sigma = \mathbb{R}^n,
\]

where

\[
\Omega_\sigma := \{x = \{x_i\} \in \mathbb{R}^n \mid x_i \ge c_i \text{ if } \sigma_i = 1;\ x_i \le b_i \text{ if } \sigma_i = -1;\ b_i < x_i < c_i \text{ if } \sigma_i = 0\}.
\]

Fig. 1. Graph of signal function $f_i$ in (H4′).

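An illustration of the decomposition for $n = 2$ is provided in Fig. 2. Let $\Lambda_e = \{\{\sigma_i\} \in A^{N_n} \mid \sigma_i = -1 \text{ or } 1 \text{ for every } i \in N_n\}$ and $\Lambda_m = \{\{\sigma_i\} \in A^{N_n} \mid \sigma_i = 0 \text{ for some } i \in N_n \text{ and } |\sigma_j| = 1 \text{ for some } j \in N_n\}$. These $3^n$ regions can then be classified into three categories: $\Omega_\sigma$ is called an exterior region if $\sigma \in \Lambda_e$, a mixed region if $\sigma \in \Lambda_m$, and an interior region if $\sigma_i = 0$ for all $i \in N_n$. Accordingly, there is only one interior region, and it will be denoted by $\Omega_0$.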
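The labeling can be sketched in a few lines. The function names below are hypothetical; the code enumerates all $\sigma \in A^{N_n}$, classifies each region, and labels a sample point. For $n = 2$ this gives four exterior regions, four mixed regions, and one interior region.

```python
from itertools import product

def region_label(x, b, c):
    """sigma for the region Omega_sigma that contains the point x."""
    return tuple(1 if xi >= ci else (-1 if xi <= bi else 0)
                 for xi, bi, ci in zip(x, b, c))

def classify(sigma):
    """Return 'exterior', 'mixed' or 'interior' for a label sigma."""
    if all(s != 0 for s in sigma):
        return "exterior"
    if all(s == 0 for s in sigma):
        return "interior"
    return "mixed"

def count_regions(n):
    """Count the 3^n regions by category."""
    counts = {"exterior": 0, "mixed": 0, "interior": 0}
    for sigma in product((-1, 0, 1), repeat=n):
        counts[classify(sigma)] += 1
    return counts

counts = count_regions(2)
# Hypothetical thresholds b_i = -1, c_i = 1 for both coordinates.
label = region_label([2.0, 0.0], b=[-1.0, -1.0], c=[1.0, 1.0])
```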

As a consequence, the equilibria of (2.1) can be classified into three types, according to their locations. An equilibrium $\bar{x} = \{\bar{x}_i\}_1^n$ is called exterior if $\bar{x}$ lies in an exterior region, mixed if $\bar{x}$ lies in a mixed region, and interior if $\bar{x}$ lies in the interior region.

With this classification, we elaborate on the existence of each type of equilibrium in the following. If replacing $\{x_i\}_1^n$ by $\{\bar{x}_i\}_1^n$ in the right-hand side of (2.1) yields zero and $b_i < \bar{x}_i < c_i$ for each $i \in N_n$, then $\{\bar{x}_i\}_1^n$ is an interior equilibrium.

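System (2.1) restricted to an exterior region $\Omega_\sigma$, $\sigma \in \Lambda_e$, takes the following form:

\[
\frac{dx_i}{dt} = a_i(x)\Bigl[\gamma_i(x_i) - \sum_{j=1}^{n}\beta_{ij}\omega_j\Bigr], \tag{2.2}
\]

where

\[
\omega_j = v_j \text{ if } \sigma_j = 1, \qquad \omega_j = u_j \text{ if } \sigma_j = -1. \tag{2.3}
\]

Thus, $\bar{x} = \{\bar{x}_i\}_1^n$ is an exterior equilibrium of (2.1) in $\Omega_\sigma$ if the right-hand side of (2.2) vanishes at $\bar{x}$ and, in addition, $\bar{x}_i \ge c_i$ for $i$ with $\sigma_i = 1$ and $\bar{x}_i \le b_i$ for $i$ with $\sigma_i = -1$.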
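The existence test for exterior equilibria can be sketched as follows, under assumptions that are not in the paper: $a_i > 0$ and $\gamma_i(x) = -x$, so that the right-hand side of (2.2) vanishes exactly at $\bar{x}_i = -\sum_j \beta_{ij}\omega_j$. The function name `exterior_equilibria` and the example data are hypothetical.

```python
from itertools import product

def exterior_equilibria(B, b, c, u, v):
    """Map sigma -> x_bar for each exterior region containing an equilibrium.

    Assumes a_i > 0 and gamma_i(x) = -x (hypothetical choices), so the
    right-hand side of (2.2) vanishes exactly at
    x_bar_i = -sum_j B[i][j] * omega_j.
    """
    n = len(B)
    found = {}
    for sigma in product((-1, 1), repeat=n):
        # omega as in (2.3)
        omega = [v[j] if sigma[j] == 1 else u[j] for j in range(n)]
        x_bar = [-sum(B[i][j] * omega[j] for j in range(n)) for i in range(n)]
        # The candidate only counts if it lies in the region it was computed for.
        inside = all(x_bar[i] >= c[i] if sigma[i] == 1 else x_bar[i] <= b[i]
                     for i in range(n))
        if inside:
            found[sigma] = x_bar
    return found

B = [[-2.0, 0.5],
     [0.5, -2.0]]
eq = exterior_equilibria(B, b=[-1.0, -1.0], c=[1.0, 1.0],
                         u=[-1.0, -1.0], v=[1.0, 1.0])
```

For this choice of `B`, every one of the four exterior regions contains an equilibrium; other matrices may leave some or all of them empty.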

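Consider a mixed region $\Omega_\sigma$, $\sigma \in \Lambda_m$. Let $J_0 = \{i \in N_n : \sigma_i = 0\}$ and $J_1 = N_n \setminus J_0$. For $i \in J_0$, the $i$th component of the vector field $F(x)$ in (2.1) restricted to $\Omega_\sigma$ becomes

\[
F(x)_i = a_i(x)\Bigl[\gamma_i(x_i) - \sum_{j \in J_0}\beta_{ij}f_j(x_j) - \sum_{j \in J_1}\beta_{ij}\omega_j\Bigr], \tag{2.4}
\]

where

\[
\omega_j = v_j \text{ if } \sigma_j = 1, \qquad \omega_j = u_j \text{ if } \sigma_j = -1. \tag{2.5}
\]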

Assume that $a_i(x) > 0$ for all $x \in \mathbb{R}^n$. Suppose there exist real numbers $\{\bar{x}_i\}_1^n$, with $\bar{x} = \{\bar{x}_i\}_1^n$, such that replacing $\{x_i\}_1^n$ by $\{\bar{x}_i\}_1^n$ in (2.4) yields zero. Then (2.4) also vanishes for $x = \{x_i\}_1^n$ with $x_i = \bar{x}_i$ if $i \in J_0$, any $x_i \le b_i$ if $\sigma_i = -1$, and any $x_i \ge c_i$ if $\sigma_i = 1$. Therefore, we have the following subsets of the phase space, which possess a certain invariance property. Namely,

\[
I_\sigma = \{x \in \mathbb{R}^n \mid x_i = \bar{x}_i \text{ if } i \in J_0,\ x_i \le b_i \text{ if } \sigma_i = -1,\ x_i \ge c_i \text{ if } \sigma_i = 1\}. \tag{2.6}
\]

An orbit starting on $I_\sigma$ remains on $I_\sigma$ before it enters the other regions $\Omega_{\sigma'}$ neighboring $\Omega_\sigma$. Note that an equilibrium in $\Omega_\sigma$, $\sigma \in \Lambda_m$, must lie on such a subset $I_\sigma$. Indeed, $\bar{x} = \{\bar{x}_i\}_1^n$ is a mixed equilibrium in $\Omega_\sigma$ if the vector field in (2.1) vanishes at $\bar{x}$ (the $i$th component of the vector field is as in (2.4) for $i \in J_0$) and, moreover, $b_i < \bar{x}_i < c_i$ for $i \in J_0$, $\bar{x}_i \ge c_i$ for $i \in J_1$ with $\sigma_i = 1$, and $\bar{x}_i \le b_i$ for $i \in J_1$ with $\sigma_i = -1$.

Now we consider (2.1) with symmetric connection strength $B$ and signal functions $f_i$ satisfying (H4′). First, let us construct a global Lyapunov function:

\[
V(x) = -\sum_{i=1}^{n}\Bigl\{\int^{f_i(x_i)}\gamma_i(g_i(\xi))\,d\xi - \frac{1}{2}\sum_{j=1}^{n}\beta_{ij}f_i(x_i)f_j(x_j)\Bigr\}, \tag{2.7}
\]

where $g_i : [u_i, v_i] \to [b_i, c_i]$ is defined by $g_i(\xi) = (f_i|_{[b_i,c_i]})^{-1}(\xi)$, and $(f_i|_{[b_i,c_i]})^{-1}$ is the inverse function of $f_i$ restricted to $[b_i, c_i]$. If each $f_i$ is differentiable on $\mathbb{R}$, then the derivative of $V$ along an orbit of (2.1) is

\begin{align}
\dot{V}(x) &= -\sum_{i=1}^{n}\dot{x}_i f_i'(x_i)\Bigl[\gamma_i(x_i) - \sum_{j=1}^{n}\beta_{ij}f_j(x_j)\Bigr] \tag{2.8}\\
&= -\sum_{i=1}^{n} f_i'(x_i)a_i(x)\Bigl[\gamma_i(x_i) - \sum_{j=1}^{n}\beta_{ij}f_j(x_j)\Bigr]^2. \tag{2.9}
\end{align}

The equality in (2.8) follows from the symmetry of $B = [\beta_{ij}]$ and the following observation. In the computation, (2.8) should only hold for the term $\gamma_i(x_i)$ with $x_i \in [b_i, c_i]$, according to the definition of $g_i$. However, $f_i'(x_i) = 0$ for $x_i \ge c_i$ or $x_i \le b_i$. Thus, for $x_i$ in these ranges, the $i$th term in the summation $\sum_{i=1}^{n}$ vanishes no matter what the terms in the bracket are. Since $f_i'(x_i) \ge 0$ for any $x_i$, $\dot{V}(x)$ in (2.9) is less than or equal to zero.
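The step from (2.7) to (2.8) can be spelled out term by term. The following is a sketch assuming that each $f_i$ is differentiable, that $B$ is symmetric, and that $x_i \in [b_i, c_i]$, so that $g_i(f_i(x_i)) = x_i$:

```latex
\begin{align*}
\frac{\partial}{\partial x_i}\int^{f_i(x_i)}\gamma_i(g_i(\xi))\,d\xi
  &= \gamma_i\bigl(g_i(f_i(x_i))\bigr)\,f_i'(x_i)
   = \gamma_i(x_i)\,f_i'(x_i),\\
\frac{\partial}{\partial x_i}\,\frac{1}{2}\sum_{k=1}^{n}\sum_{j=1}^{n}
  \beta_{kj}f_k(x_k)f_j(x_j)
  &= \frac{1}{2}\,f_i'(x_i)\sum_{j=1}^{n}(\beta_{ij}+\beta_{ji})f_j(x_j)
   = f_i'(x_i)\sum_{j=1}^{n}\beta_{ij}f_j(x_j),\\
\frac{\partial V}{\partial x_i}
  &= -f_i'(x_i)\Bigl[\gamma_i(x_i)-\sum_{j=1}^{n}\beta_{ij}f_j(x_j)\Bigr].
\end{align*}
```

Summing $\dot{x}_i\,\partial V/\partial x_i$ over $i$ gives (2.8), and substituting $\dot{x}_i$ from (2.1) gives (2.9).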

If some $f_i$ is not differentiable, an alternative computation yields the same result. Namely, consider

\[
\dot{V}(x) = \limsup_{h \to 0^+} \frac{1}{h}\bigl[V(x + hF(x)) - V(x)\bigr],
\]

where $F(x)$ is the vector field in (2.1), cf. [8]. The detailed computation is similar to the one in [10]. Let $S$ be the set on which $V$ remains constant along an orbit of (2.1), that is,

\[
S = \{x \in \mathbb{R}^n : \dot{V}(x) = 0\}.
\]

Then, the closure of $S$ can be represented by

\[
\bar{S} = \Bigl(\bigcup_{\sigma \in \Lambda_e}\Omega_\sigma\Bigr) \cup \Bigl(\bigcup I_\sigma\Bigr) \cup E_0. \tag{2.10}
\]

Herein, $\bigcup_{\sigma \in \Lambda_e}\Omega_\sigma$ is the union of all exterior regions, $E_0$ is the set of equilibria in the interior region, and $\bigcup I_\sigma$ is the union of the subsets in mixed regions, as discussed in (2.6), whenever they exist. We shall call each point (an equilibrium) of $E_0$, each of the exterior regions $\Omega_\sigma$, and each of these $I_\sigma$ a component of $S$.

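Next, we introduce the regional Lyapunov function $V_\sigma$ for (2.1) restricted to each exterior region $\Omega_\sigma$ or each $I_\sigma$ in a mixed region. Consider an exterior region $\Omega_\sigma$, $\sigma \in \Lambda_e$. Let

\[
V_\sigma(x) = -\sum_{i=1}^{n}\Bigl\{\int^{x_i}\gamma_i(\xi)\,d\xi - x_i\sum_{j=1}^{n}\beta_{ij}\omega_j\Bigr\}, \tag{2.11}
\]

where $\omega_j$ is as defined in (2.3). The derivative of this function along a solution of (2.1) lying in $\Omega_\sigma$ is

\[
\dot{V}_\sigma(x) = -\sum_{i=1}^{n}\dot{x}_i\Bigl[\gamma_i(x_i) - \sum_{j=1}^{n}\beta_{ij}\omega_j\Bigr] = -\sum_{i=1}^{n}a_i(x)\Bigl[\gamma_i(x_i) - \sum_{j=1}^{n}\beta_{ij}\omega_j\Bigr]^2 \le 0.
\]

The equality holds if and only if $a_i(x)\bigl[\gamma_i(x_i) - \sum_{j=1}^{n}\beta_{ij}\omega_j\bigr] = 0$ for every $i \in N_n$. That is, $\dot{V}_\sigma(x)$ vanishes only at an exterior equilibrium $x$ in $\Omega_\sigma$.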

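Suppose $I_\sigma$ lies in a mixed region $\Omega_\sigma$, $\sigma \in \Lambda_m$. Recall that $J_0 = \{i \in N_n : \sigma_i = 0\}$ and $J_1 = N_n \setminus J_0$, and recall the notations in (2.6). Let

\[
V_\sigma(x) = -\sum_{i \in J_1}\Bigl\{\int^{x_i}\gamma_i(\xi)\,d\xi - x_i\sum_{j \in J_1}\beta_{ij}\omega_j - x_i\sum_{j \in J_0}\beta_{ij}f_j(\bar{x}_j)\Bigr\}, \tag{2.12}
\]

where $\omega_j$ is as described in (2.5). It can be verified that $\dot{V}_\sigma(x)$, the derivative of $V_\sigma$ along a solution of (2.1) lying in $I_\sigma$, vanishes only at a mixed equilibrium in $I_\sigma$.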

With the global Lyapunov function $V$ and these regional Lyapunov functions $V_\sigma$, we can then derive the following result. It extends Theorem 2.1 to the class of signal functions $f_i$ satisfying (H4′).

Theorem 2.2. Assume (H1), (H2), (H4′), (H5) and that $a_i(x) > 0$ for all $x \in \mathbb{R}^n$. The dynamics of (2.1) are convergent if every equilibrium is isolated.

Proof. By changing the coordinates, it suffices to consider (2.1) with symmetric $B = [\beta_{ij}]$. Notably, if there is an equilibrium in a mixed region $\Omega_\sigma$, then a subset $I_\sigma$ described in (2.6) exists and this equilibrium lies on $I_\sigma$. With the assumption that every equilibrium is isolated, the components of $S$ are pairwise disjoint. Indeed, any two distinct exterior regions are disjoint. In addition, any two components belonging to two different regions $\Omega_\sigma$, $\Omega_{\sigma'}$, $\sigma \neq \sigma'$, are disjoint, since there is a $j \in N_n$ such that $x_j \neq \tilde{x}_j$ for any $x = (x_1, \dots, x_n) \in \Omega_\sigma$ and any $\tilde{x} = (\tilde{x}_1, \dots, \tilde{x}_n) \in \Omega_{\sigma'}$. Furthermore, the same argument justifies that any two components of $S$ belonging to the same $\Omega_\sigma$ are disjoint. Consider an orbit $\phi(t, x_0)$ and its $\omega$-limit set, $\omega(\phi(t, x_0))$. It follows from the existence of the global Lyapunov function $V$ that $\omega(\phi(t, x_0))$ lies in one component of $S$. Let $x^* \in \omega(\phi(t, x_0))$. If $x^* \in E_0$, then $x^*$ is already an equilibrium. Suppose $x^* \in \Omega_\sigma$, $\sigma \in \Lambda_e$. Then $\phi(t, x^*) \in \Omega_\sigma$ for all $t$, since $V(\phi(t, x^*)) = V(x^*)$ for all $t$ and $V(\phi(t, x^*))$ decreases as $\phi(t, x^*)$ leaves $\Omega_\sigma$. By the existence of the regional Lyapunov function $V_\sigma(x)$ in (2.11), it follows that $x^*$ has to be an exterior equilibrium. The same argument holds for $I_\sigma \subset \Omega_\sigma$, $\sigma \in \Lambda_m$. That is, if $x^* \in I_\sigma$, then $x^*$ has to be a mixed equilibrium in $\Omega_\sigma$. It is also obvious that the $\omega$-limit set of $\phi(t, x_0)$ consists of a single equilibrium. This completes the proof.

Remark. It can be shown by Sard’s theorem that the equilibrium points of (2.1) are isolated for almost every matrix of connection strength $B$, with a mild assumption on the values of the signal functions at the inhibitory thresholds. The verification is similar to the one in [4].

3. More generalizations

Theorem 2.2 is valid for other signal functions with saturations. For example, arguments similar to those in the proof of Theorem 2.2 confirm the convergence of (2.1) with one-sided signal functions $f_i$ (as in Fig. 3). This class of signal functions fits the setting of suprathreshold and subthreshold variables in [4].

Our result can further be extended to stairway-like multi-saturated signal functions. Let $m > 1$ be an integer. For $i \in \{1, 2, \dots, n\}$, let each of $\{b^i_1, c^i_1, b^i_2, c^i_2, \dots, b^i_m, c^i_m\}$ and $\{u^i_0, u^i_1, u^i_2, \dots, u^i_m\}$ be a partition of $\mathbb{R}$ with $b^i_1 < c^i_1 < b^i_2 < c^i_2 < \cdots < b^i_m < c^i_m$ and $u^i_0 < u^i_1 < u^i_2 < \cdots < u^i_m$. For each $i = 1, 2, \dots, n$, let $f_i$ be a continuous function defined by

\[
f_i(\xi) =
\begin{cases}
u^i_0 & \text{if } -\infty < \xi \le b^i_1,\\
u^i_j & \text{if } c^i_j \le \xi \le b^i_{j+1},\ j = 1, \dots, m-1,\\
\text{increasing} & \text{if } b^i_j \le \xi \le c^i_j,\ j = 1, \dots, m,\\
u^i_m & \text{if } c^i_m \le \xi < \infty.
\end{cases}
\]

Such a signal function is illustrated in Fig. 4. For each $i = 1, 2, \dots, n$, let $g_i : [u^i_0, u^i_m] \to \bigcup_{j=1}^{m}[b^i_j, c^i_j) \cup \{c^i_m\}$ be a function defined by $g_i(\xi) = (f_i|_{[b^i_j, c^i_j)})^{-1}(\xi)$ if $\xi \in [u^i_{j-1}, u^i_j)$ for $j = 1, 2, \dots, m$, and $g_i(u^i_m) = c^i_m$, where $(f_i|_{[b^i_j, c^i_j)})^{-1}$ is the inverse function of $f_i$ restricted to $[b^i_j, c^i_j)$. Then the function $V$ in (2.7) is a global Lyapunov function for (2.1). The computations in (2.8) and (2.9) remain valid by arguments similar to those following (2.9). Thus, the convergence theorem for (2.1) with such signal functions can be analogously concluded by establishing the associated regional Lyapunov functions.
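A stairway-like signal function of this kind can be sketched as follows. The rises are taken to be linear, which is an assumption of this sketch since the text only requires $f_i$ to be increasing on each $[b_j, c_j]$; the helper name `make_stairway` and the sample data are hypothetical.

```python
def make_stairway(intervals, levels):
    """Build a stairway signal function.

    intervals: [(b_1, c_1), ..., (b_m, c_m)] with b_1 < c_1 < ... < b_m < c_m
    levels:    [u_0, u_1, ..., u_m] with u_0 < u_1 < ... < u_m
    Linear rises are an assumption of this sketch.
    """
    if len(levels) != len(intervals) + 1:
        raise ValueError("need one more level than intervals")

    def f(xi):
        for j, (b, c) in enumerate(intervals, start=1):
            if xi < b:
                return levels[j - 1]      # plateau to the left of b_j
            if xi <= c:
                # linear rise from u_{j-1} to u_j on [b_j, c_j]
                rise = levels[j] - levels[j - 1]
                return levels[j - 1] + rise * (xi - b) / (c - b)
        return levels[-1]                 # xi > c_m

    return f

# m = 2: plateaus at 0, 1 and 2, rises on [-3, -2] and [1, 2].
f = make_stairway(intervals=[(-3.0, -2.0), (1.0, 2.0)],
                  levels=[0.0, 1.0, 2.0])
samples = [f(t) for t in (-10.0, -2.5, 0.0, 1.5, 5.0)]
```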

Fig. 4. Multi-saturated signal function.

Finally, we note that it is not necessary for the signal functions $f_i$ in (2.1) to have the same number of saturations to conclude the convergence of dynamics. Restated, the number of saturations can range from 0 to any positive integer $m + 1$, and $m$ can vary with $i$.

Acknowledgements

The authors are supported, in part, by the National Science Council of Taiwan, ROC. The authors would like to thank the referee for calling their attention to the results of Gedeon and Maybee.

References

[1] L.O. Chua, L. Yang, Cellular neural networks: theory, IEEE Trans. Circuits Syst. 35 (1988) 1257.

[2] L.O. Chua, L. Yang, Cellular neural networks: applications, IEEE Trans. Circuits Syst. 35 (1988) 1273.

[3] L.O. Chua, CNN: A Paradigm for Complexity, World Scientific Series on Nonlinear Science, Series A, Vol. 31, World Scientific, Singapore, 1998.

[4] M.A. Cohen, S. Grossberg, Absolute stability of global pattern formation and parallel memory storage by competitive neural networks, IEEE Trans. Syst. Man Cybernet. SMC-13 (1983) 815.

[5] B. Fiedler, T. Gedeon, A class of convergent neural network dynamics, Physica D 111 (1998) 288.

[6] T. Gedeon, Structure and dynamics of artificial neural networks, Fields Inst. Commun. 21 (1999) 217.

[7] S. Grossberg, Competition, decision, and consensus, J. Math. Anal. Appl. 66 (1978) 470.

[8] J. Hale, Ordinary Differential Equations, Krieger, Florida, 1980.

[9] J.J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proc. Natl. Acad. Sci. 81 (1984) 3088.

[10] S.S. Lin, C.W. Shih, Complete stability for standard cellular neural network, Int. J. Bifur. Chaos 9 (5) (1999) 909.

[11] J. Maybee, Combinatorially symmetric matrices, Linear Algebra Appl. 8 (1974) 529.