Cycle-symmetric matrices and convergent neural networks
Chih-Wen Shih∗, Chih-Wen Weng
Department of Applied Mathematics, National Chiao Tung University, Hsinchu, Taiwan, ROC
Received 30 November 1999; received in revised form 20 June 2000; accepted 21 June 2000 Communicated by C.K.R.T. Jones
Abstract
This work investigates a class of neural networks with cycle-symmetric connection strength. We shall show that, by changing the coordinates, the convergence of dynamics by Fiedler and Gedeon [Physica D 111 (1998) 288] is equivalent to the classical results. This presentation also addresses the extension of the convergence theorem to other classes of signal functions with saturations. In particular, the result of Cohen and Grossberg [IEEE Trans. Syst. Man Cybernet. SMC-13 (1983) 815] is recast and extended with a more concise verification. © 2000 Elsevier Science B.V. All rights reserved.
MSC: 34C37; 68T10; 92B20
Keywords: Neural networks; Cycle-symmetric matrix; Lyapunov function; Convergence of dynamics
1. Introduction
The notion of a neural network, in addition to its role in biological modeling, has been applied in various scientific areas such as circuit architecture and numerical computation. In designing a neural network, it is usually of prime importance to guarantee the convergence of the corresponding dynamical system, cf. [1–4,6,7,9]. Convergence of dynamics means that every solution tends to a stationary solution as time goes to positive infinity. Such convergence is often established by constructing a Lyapunov function and then applying LaSalle’s invariance principle. Classical results on the construction of the Lyapunov function require symmetry of the matrix of connection strengths between neurons. For example, the works by Cohen and Grossberg [4] and by Chua and Yang [1] made this assumption. Significant progress was made by Fiedler and Gedeon [5], who extended the Lyapunov function, and hence the convergence of dynamics, to more general matrices of connection strength. The following preparation is needed to define such matrices.
∗ Corresponding author. Fax: +886-3-572-4679.
E-mail address: [email protected] (C.-W. Shih).
0167-2789/00/$ – see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0167-2789(00)00134-2
For an undirected graph $G$ without loops or multiple edges, a path is defined as a sequence of vertices $v_1 v_2 \cdots v_k$, $k \ge 1$, where $v_i v_{i+1}$ is an edge of $G$ for each $i \in \{1, \ldots, k-1\}$ and there are no repeated vertices except possibly the first and the last. By a cycle, we mean a closed path of length greater than or equal to three, that is, $v_1 = v_k$ and $k \ge 3$. Let $B$ denote an $n \times n$ matrix with entries $\beta_{ij} \in \mathbb{R}$, $i, j \in \{1, 2, \ldots, n\}$. If $\beta_{ij} \ne 0$ whenever $\beta_{ji} \ne 0$, then a graph with $n$ vertices can be defined from $B$. Indeed, for $i \ne j$, an unordered pair $(i, j)$ with $\beta_{ij} \ne 0$ is an edge of this graph. Consider the class of matrices with entries $\beta_{ij}$ satisfying
(H1) $\beta_{ij}\beta_{ji} > 0$ if $\beta_{ij} \ne 0$,
(H2) $\prod_C \beta_{ik} = \prod_C \beta_{ki}$, along every cycle $C$,
where $\prod$ denotes the product. Notably, condition (H1) means that the entries of $B$ are sign-symmetric. This class of matrices, called cycle-symmetric herein, has been investigated in [11,12]. In fact, these matrices are characterized as the matrices that are similar to symmetric matrices by real diagonal matrices. Restated, if $B$ satisfies (H1) and (H2), then there exists an invertible diagonal matrix $P$ such that $PBP^{-1}$ is a symmetric matrix. Notably, this characterization theorem was first obtained by Parter and Youngs [12]. Maybee [11] then considered a class of so-called combinatorially symmetric matrices ($B = [\beta_{ij}]$ with $\beta_{ji} \ne 0$ if $\beta_{ij} \ne 0$) and weakened the condition (H1). A matrix is called pseudosymmetric therein if it is similar to a symmetric matrix by a real diagonal matrix. However, (H1) and (H2) are the basic conditions for a matrix to be pseudosymmetric.
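The characterization just stated (a matrix satisfying (H1) and (H2) is diagonally similar to a symmetric matrix) can be illustrated numerically. The following is a minimal sketch, not the construction used in [11,12]: it assumes the input already satisfies (H1) and (H2), and the helper names `diagonal_symmetrizer`, `conjugate` and `is_symmetric` are hypothetical. It walks the graph of $B$ and fixes the ratio $p_j / p_i$ edge by edge from the requirement $p_i \beta_{ij} / p_j = p_j \beta_{ji} / p_i$.

```python
import math
from collections import deque

def diagonal_symmetrizer(B):
    """Return the diagonal p of P = diag(p) with P B P^{-1} symmetric.

    Assumption: B satisfies (H1) and (H2); (H2) is not checked here.
    """
    n = len(B)
    p = [None] * n
    for root in range(n):
        if p[root] is not None:
            continue
        p[root] = 1.0  # free normalization on each connected component
        queue = deque([root])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if j != i and B[i][j] != 0 and p[j] is None:
                    # Symmetry forces (p_j / p_i)^2 = b_ij / b_ji,
                    # and this ratio is positive by (H1).
                    p[j] = p[i] * math.sqrt(B[i][j] / B[j][i])
                    queue.append(j)
    return p

def conjugate(B, p):
    """Entries of P B P^{-1} for P = diag(p)."""
    n = len(B)
    return [[p[i] * B[i][j] / p[j] for j in range(n)] for i in range(n)]

def is_symmetric(M, tol=1e-12):
    n = len(M)
    return all(abs(M[i][j] - M[j][i]) <= tol for i in range(n) for j in range(n))

# Illustrative data: B = D^{-1} S D with S symmetric, so B is
# cycle-symmetric by construction but not symmetric itself.
S = [[2.0, 1.0, -3.0], [1.0, 0.0, 4.0], [-3.0, 4.0, 5.0]]
d = [1.0, 2.0, 0.5]
B = [[S[i][j] * d[j] / d[i] for j in range(3)] for i in range(3)]
A = conjugate(B, diagonal_symmetrizer(B))
```

Only positive $p_i$ are produced here; since the symmetry condition involves $p_i^2$ alone, the signs of the $p_i$ may be chosen freely.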
Fiedler and Gedeon [5] generalized the Lyapunov function for the neural network proposed in [4] to accommodate the network with the connection strength satisfying (H1) and (H2). The condition (H1) was further weakened in [6]. The studies in [5,6] thus concluded the convergence of dynamics for the system with a larger class of connection strength.
The first goal of this paper is to show that, with the characterization of the cycle-symmetric matrices, a change of coordinates can transform the system to a similar system, but with symmetric connection strength. Therefore, the convergence results in [5] are equivalent to the classical ones. This approach answers the question raised in [6] (also mentioned in [5]), which is whether the characterization theorems in [11,12] can be applied directly to prove the convergence theorem. The new treatment in this presentation is considered a more natural generalization of the classical results, since, for example, symmetric matrices are easier to handle in various related computations.
Our second objective is to extend the convergence theory to other signal functions, in particular, signal functions with saturations. Such functions have been used as output functions in cellular neural networks [1–3]. Similar signal functions have also been considered in [4], where the transition from zero slope to positive slope in the signal functions relates to the notion of inhibitory signal threshold. Based on the preceding technique of changing coordinates, we shall extend the convergence of dynamics to systems with more general saturated signal functions. This work not only provides an explicit formulation of these signal functions but also develops a new, concise proof of convergence. Our first step is to partition the phase space according to the configurations of the signal functions. The convergence of dynamics is then established by constructing a global Lyapunov function as well as certain regional Lyapunov functions. The latter are naturally tied to the existence of equilibria of the system and to the partitioning of the phase space. This approach is more straightforward than the one in [4] and more general than the one in [10]. This investigation further explores the intrinsic structure of the model equations discussed in this presentation. Indeed, for an arbitrary dynamical system with a global Lyapunov function, the existence of a regional Lyapunov function on the set where the global Lyapunov function is constant is not guaranteed.
We shall present our results for strictly increasing and two-sided saturated signal functions in Section 2. Extension of the convergence theorem to more general saturated signal functions will be discussed in Section 3.
2. Main results
We consider the following system proposed by Cohen and Grossberg [4], and later investigated in [5,6],
$$\frac{dx_i}{dt} = a_i(x)\Big[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij} f_j(x_j)\Big], \quad i = 1, 2, \ldots, n, \tag{2.1}$$
where $x = (x_1, x_2, \ldots, x_n)$. Denote by $F_i(x)$ the right-hand side of (2.1) and $F = (F_1, F_2, \ldots, F_n)$. The following assumptions have been made in [5,6] in addition to (H1), (H2):
(H3) $a_i(x) \ge 0$ for all $x \in \mathbb{R}^n$ and every $i = 1, 2, \ldots, n$,
(H4) $f_i'(\xi) > 0$ for all $\xi \in \mathbb{R}$ and every $i = 1, 2, \ldots, n$.
There are extra conditions which guarantee the dissipativeness, hence the existence of the global attractor, for the system (2.1).
(H5) All $f_i$ are bounded; $a_i(x) > 0$ for all sufficiently large $|x|$; $\gamma_i(x_i) x_i \to -\infty$ as $|x_i| \to \infty$.
Let $B = [\beta_{ij}]$ be a cycle-symmetric matrix, that is, $B$ satisfies (H1) and (H2). By the theorem in [11,12], there exists an invertible diagonal matrix $P$ such that $PBP^{-1} = A$ with $A = [\alpha_{ij}]$, a symmetric matrix. Denote the diagonal entries of $P$ by $p_1, p_2, \ldots, p_n$, where every $p_i$ is nonzero. Set $y = Px$, that is, $y_i = p_i x_i$ for each $i$. Eq. (2.1) in the new variables takes the following form:
$$\begin{aligned}
\frac{dy_i}{dt} &= p_i a_i(P^{-1}y)\Big[\gamma_i(p_i^{-1}y_i) - \sum_{j=1}^{n} \beta_{ij} f_j(p_j^{-1}y_j)\Big] \\
&= a_i(P^{-1}y)\Big[p_i\gamma_i(p_i^{-1}y_i) - \sum_{j=1}^{n} p_i\beta_{ij}p_j^{-1}\, p_j f_j(p_j^{-1}y_j)\Big] \\
&= \tilde a_i(y)\Big[\tilde\gamma_i(y_i) - \sum_{j=1}^{n} \alpha_{ij}\tilde f_j(y_j)\Big],
\end{aligned}$$
where $\tilde a_i(y) = a_i(P^{-1}y)$, $\tilde\gamma_i(y_i) = p_i\gamma_i(p_i^{-1}y_i)$, and $\tilde f_i(y_i) = p_i f_i(p_i^{-1}y_i)$. Notice that $\tilde a_i$ satisfies (H3), $\tilde f_i$ satisfies (H4), and $\tilde a_i$, $\tilde f_i$, and $\tilde\gamma_i$ satisfy (H5). Therefore, the Lyapunov function
$$V(y) = -\sum_{i=1}^{n}\Big[\int^{y_i} \tilde\gamma_i(\xi)\tilde f_i'(\xi)\,d\xi - \frac{1}{2}\sum_{j=1}^{n} \alpha_{ij}\tilde f_i(y_i)\tilde f_j(y_j)\Big],$$
which was proposed in [4] for symmetric connection strength, still applies here. We thus obtain the main theorem in [5].
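The role of the Lyapunov function can be checked on a small example. The sketch below is illustrative only: the choices $a_i \equiv 1$, $\gamma_i(x) = -x$, $f_i = \tanh$, the 3 × 3 symmetric matrix, the initial point and the RK4 step size are all assumptions made for this example, not taken from [4] or [5]. For these choices the integral in $V$ has the closed form $\int_0^x s\,\mathrm{sech}^2 s\,ds = x\tanh x - \log\cosh x$, and the code records $V$ along a numerically integrated orbit.

```python
import math

# Hypothetical symmetric connection matrix (illustrative values).
A = [[0.0, 1.5, -0.7],
     [1.5, 0.0, 0.4],
     [-0.7, 0.4, 0.0]]
n = len(A)

def field(x):
    """Right-hand side of (2.1) with a_i = 1, gamma_i(x) = -x, f_i = tanh."""
    f = [math.tanh(v) for v in x]
    return [-x[i] - sum(A[i][j] * f[j] for j in range(n)) for i in range(n)]

def lyapunov(x):
    """V(x) = -sum_i int_0^{x_i} gamma_i f_i' + (1/2) sum_ij A_ij f_i f_j."""
    f = [math.tanh(v) for v in x]
    # -int_0^x (-s) sech^2(s) ds = x tanh x - log cosh x
    integral = sum(v * math.tanh(v) - math.log(math.cosh(v)) for v in x)
    quad = 0.5 * sum(A[i][j] * f[i] * f[j] for i in range(n) for j in range(n))
    return integral + quad

def rk4_step(x, h):
    """One classical Runge-Kutta step for dx/dt = field(x)."""
    k1 = field(x)
    k2 = field([x[i] + 0.5 * h * k1[i] for i in range(n)])
    k3 = field([x[i] + 0.5 * h * k2[i] for i in range(n)])
    k4 = field([x[i] + h * k3[i] for i in range(n)])
    return [x[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
            for i in range(n)]

x = [2.0, -1.0, 0.5]          # arbitrary initial condition
values = [lyapunov(x)]
for _ in range(2000):         # integrate up to t = 20 with h = 0.01
    x = rk4_step(x, 0.01)
    values.append(lyapunov(x))
```

Along the computed orbit the recorded values of $V$ should be nonincreasing up to discretization and rounding error, in line with the sign of $\dot V$.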
Theorem 2.1. Assume (H1)–(H5). The dynamics of (2.1) are convergent if every equilibrium is isolated.
The second goal of this presentation is to extend the convergence of dynamics for (2.1) to other signal functions $f_i$. In particular, we consider sigmoidal $f_i$ with some saturations. In this case, the slope of $f_i$ becomes only nonnegative (compare with (H4)). These functions are described as follows.
(H4′) Let $\{b_i\}_1^n$, $\{c_i\}_1^n$, $\{u_i\}_1^n$, $\{v_i\}_1^n$ be sequences of real numbers with $b_i < c_i$, $u_i < v_i$ for each $i$. For $i = 1, 2, \ldots, n$, let $f_i = f_i(\xi_i)$ be a function which is continuous on $\mathbb{R}$, increasing on $[b_i, c_i]$, $f_i(\xi_i) = v_i$ for all $\xi_i \ge c_i$, and $f_i(\xi_i) = u_i$ for all $\xi_i \le b_i$.
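A concrete instance of a two-sided saturated signal function may help fix ideas. The sketch below builds a piecewise-linear example; the linear ramp and the name `make_saturated` are assumptions for illustration, since the hypothesis only requires continuity and monotonicity on $[b_i, c_i]$.

```python
def make_saturated(b, c, u, v):
    """Piecewise-linear f with f = u on (-inf, b] and f = v on [c, inf).

    The linear ramp on [b, c] is one illustrative choice of an
    increasing, continuous connection between the two plateaus.
    """
    if not (b < c and u < v):
        raise ValueError("need b < c and u < v")
    slope = (v - u) / (c - b)

    def f(xi):
        if xi <= b:
            return u
        if xi >= c:
            return v
        return u + slope * (xi - b)

    return f

# With b = u = -1 and c = v = 1 this is the clipped identity,
# f(xi) = (|xi + 1| - |xi - 1|) / 2.
f = make_saturated(-1.0, 1.0, -1.0, 1.0)
```

The parameter choice in the last line reproduces the piecewise-linear output function commonly used in cellular neural networks.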
A typical function $f_i$ satisfying (H4′) is depicted in Fig. 1. The phase space $\mathbb{R}^n$ for the dynamical system generated by (2.1) can be decomposed into $3^n$ regions, corresponding to the partitioning of the domains in the definition of these sigmoidal functions $f_i$. The following labeling and notations are used to describe these regions. Denote by $N_n$ the set of positive integers from 1 to $n$, and by $A^{N_n}$ the set of all functions $\sigma : N_n \to A$, where $A := \{-1, 0, 1\}$. It follows that
$$\bigcup_{\sigma \in A^{N_n}} \Omega_\sigma = \mathbb{R}^n,$$
where
$$\Omega_\sigma := \{x = \{x_i\} \in \mathbb{R}^n \mid x_i \ge c_i \text{ if } \sigma_i = 1;\ x_i \le b_i \text{ if } \sigma_i = -1;\ b_i < x_i < c_i \text{ if } \sigma_i = 0\}.$$
Fig. 1. Graph of signal function $f_i$ in (H4′).
An illustration of the decomposition for $n = 2$ is provided in Fig. 2. Let $\Lambda_e = \{\{\sigma_i\} \in A^{N_n} \mid \sigma_i = -1 \text{ or } 1\}$ and $\Lambda_m = \{\{\sigma_i\} \in A^{N_n} \mid \sigma_i = 0 \text{ for some } i \in N_n \text{ and } |\sigma_j| = 1 \text{ for some } j \in N_n\}$. These $3^n$ regions can then be classified into three categories: $\Omega_\sigma$ is called an exterior region if $\sigma \in \Lambda_e$, a mixed region if $\sigma \in \Lambda_m$, and an interior region if $\sigma_i = 0$ for all $i \in N_n$. Accordingly, there is only one interior region and it will be denoted by $\Omega_0$.
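The labeling of regions by $\sigma$ is easy to mechanize. The following sketch assigns a label to a point and classifies the label; the function names `label` and `region_type` and the threshold values are hypothetical, chosen only for illustration with $n = 2$.

```python
from itertools import product

def label(x, b, c):
    """sigma_i = 1 if x_i >= c_i, -1 if x_i <= b_i, and 0 otherwise."""
    return tuple(1 if xi >= ci else (-1 if xi <= bi else 0)
                 for xi, bi, ci in zip(x, b, c))

def region_type(sigma):
    """Classify a label as 'interior', 'exterior' or 'mixed'."""
    if all(s == 0 for s in sigma):
        return "interior"
    if all(s != 0 for s in sigma):
        return "exterior"
    return "mixed"

# Illustrative thresholds for n = 2.
b, c = [-1.0, -1.0], [1.0, 1.0]

# Count the 3^n labels by type.
counts = {"exterior": 0, "mixed": 0, "interior": 0}
for sigma in product((-1, 0, 1), repeat=2):
    counts[region_type(sigma)] += 1
```

For general $n$ the same count gives $2^n$ exterior regions, one interior region, and $3^n - 2^n - 1$ mixed regions.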
As a consequence, the equilibria for (2.1) can be classified into three types, according to their locations. An equilibrium $\bar x = \{\bar x_i\}_1^n$ is called exterior if $\bar x$ lies in an exterior region, mixed if $\bar x$ lies in a mixed region, and interior if $\bar x$ lies in the interior region.
With this classification, we now describe the conditions for the existence of each type of equilibrium. If substituting $\{x_i\}_1^n$ by $\{\bar x_i\}_1^n$ into the right-hand side of (2.1) yields zero and $b_i < \bar x_i < c_i$ for each $i \in N_n$, then $\{\bar x_i\}_1^n$ is an interior equilibrium.
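Eq. (2.1) restricted to an exterior region $\Omega_\sigma$, $\sigma \in \Lambda_e$, takes the following form:
$$\frac{dx_i}{dt} = a_i(x)\Big[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij}\omega_j\Big], \tag{2.2}$$
where
$$\omega_j = v_j \text{ if } \sigma_j = 1, \quad \omega_j = u_j \text{ if } \sigma_j = -1. \tag{2.3}$$
Thus, $\bar x = \{\bar x_i\}_1^n$ is an exterior equilibrium of (2.1) in $\Omega_\sigma$ if the right-hand side of (2.2) vanishes at $\bar x$ for every $i$, and $\bar x_i \ge c_i$ for $i$ with $\sigma_i = 1$ and $\bar x_i \le b_i$ for $i$ with $\sigma_i = -1$.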
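Consider a mixed region $\Omega_\sigma$, $\sigma \in \Lambda_m$. Let $J_0 = \{i \in N_n : \sigma_i = 0\}$ and $J_1 = N_n \setminus J_0$. For $i \in J_0$, the $i$th component of the vector field $F(x)$ in (2.1) restricted to $\Omega_\sigma$ becomes
$$F_i(x) = a_i(x)\Big[\gamma_i(x_i) - \sum_{j \in J_0} \beta_{ij} f_j(x_j) - \sum_{j \in J_1} \beta_{ij}\omega_j\Big], \tag{2.4}$$
where
$$\omega_j = v_j \text{ if } \sigma_j = 1, \quad \omega_j = u_j \text{ if } \sigma_j = -1. \tag{2.5}$$
Assume that $a_i(x) > 0$ for all $x \in \mathbb{R}^n$. Suppose there exist real numbers $\{\bar x_i\}_1^n$ with $\bar x = \{\bar x_i\}_1^n$ such that substituting $\{x_i\}_1^n$ by $\{\bar x_i\}_1^n$ into (2.4) yields zero. Then, (2.4) also vanishes for $x = \{x_i\}_1^n$ with $x_i = \bar x_i$ if $i \in J_0$, and any $x_i \le b_i$ if $\sigma_i = -1$, as well as any $x_i \ge c_i$ if $\sigma_i = 1$. Therefore, we have the following subset of the phase space, which possesses a certain invariance property. Namely,
$$I_\sigma = \{x \in \mathbb{R}^n \mid x_i = \bar x_i \text{ if } i \in J_0,\ x_i \le b_i \text{ if } \sigma_i = -1,\ x_i \ge c_i \text{ if } \sigma_i = 1\}. \tag{2.6}$$
An orbit starting on $I_\sigma$ remains on $I_\sigma$ until it enters one of the other regions $\Omega_{\sigma'}$ neighboring $\Omega_\sigma$. Note that an equilibrium in $\Omega_\sigma$, $\sigma \in \Lambda_m$, must lie on such a subset $I_\sigma$. Indeed, $\bar x = \{\bar x_i\}_1^n$ is a mixed equilibrium in $\Omega_\sigma$ if the vector field in (2.1) vanishes at $\bar x$ (the $i$th component of the vector field is as in (2.4) for $i \in J_0$) and, moreover, $b_i < \bar x_i < c_i$ for $i \in J_0$, $\bar x_i \ge c_i$ for $i \in J_1$ with $\sigma_i = 1$, and $\bar x_i \le b_i$ for $i \in J_1$ with $\sigma_i = -1$.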
Now we consider (2.1) with symmetric connection strength $B$ and signal functions $f_i$ satisfying (H4′). First, let us construct a global Lyapunov function:
$$V(x) = -\sum_{i=1}^{n}\Big[\int^{f_i(x_i)} \gamma_i(g_i(\xi))\,d\xi - \frac{1}{2}\sum_{j=1}^{n} \beta_{ij} f_i(x_i) f_j(x_j)\Big], \tag{2.7}$$
where $g_i : [u_i, v_i] \to [b_i, c_i]$ is defined by $g_i(\xi) = (f_i|_{[b_i,c_i]})^{-1}(\xi)$, and $(f_i|_{[b_i,c_i]})^{-1}$ is the inverse function of $f_i$ restricted to $[b_i, c_i]$. If each $f_i$ is differentiable on $\mathbb{R}$, then the derivative of $V$ along an orbit of (2.1) is
$$\dot V(x) = -\sum_{i=1}^{n} \dot x_i f_i'(x_i)\Big[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij} f_j(x_j)\Big] \tag{2.8}$$
$$= -\sum_{i=1}^{n} f_i'(x_i) a_i(x)\Big[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij} f_j(x_j)\Big]^2. \tag{2.9}$$
The equality in (2.8) follows from the symmetry of $B = [\beta_{ij}]$ and the following observation. In the computation, (2.8) should only hold for the term $\gamma_i(x_i)$ with $x_i \in [b_i, c_i]$ according to the definition of $g_i$. However, for $x_i \ge c_i$ or $x_i \le b_i$, $f_i'(x_i) = 0$. Thus, for $x_i$ in these ranges, the $i$th term in the summation $\sum_{i=1}^{n}$ vanishes no matter what the terms in the bracket are. Since $f_i'(x_i) \ge 0$ for any $x_i$, $\dot V(x)$ in (2.9) is less than or equal to zero.
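For convenience, the passage from (2.7) to (2.8) can be spelled out as follows. This is a sketch of the chain-rule bookkeeping under the assumptions already made in the text (each $f_i$ differentiable, $B$ symmetric); in the saturated ranges both sides of the first identity vanish because $f_i'(x_i) = 0$.

```latex
\begin{align*}
\frac{d}{dt}\int^{f_i(x_i)} \gamma_i(g_i(\xi))\,d\xi
  &= \gamma_i\bigl(g_i(f_i(x_i))\bigr)\, f_i'(x_i)\,\dot x_i
   = \gamma_i(x_i)\, f_i'(x_i)\,\dot x_i
   \qquad (b_i \le x_i \le c_i), \\
\frac{d}{dt}\,\frac{1}{2}\sum_{i,j=1}^{n} \beta_{ij} f_i(x_i) f_j(x_j)
  &= \frac{1}{2}\sum_{i,j=1}^{n} \beta_{ij}
     \bigl[f_i'(x_i)\,\dot x_i\, f_j(x_j) + f_i(x_i)\, f_j'(x_j)\,\dot x_j\bigr]
   = \sum_{i,j=1}^{n} \beta_{ij}\, f_i'(x_i)\,\dot x_i\, f_j(x_j)
   \qquad (\beta_{ij} = \beta_{ji}), \\
\dot V(x)
  &= -\sum_{i=1}^{n} \gamma_i(x_i)\, f_i'(x_i)\,\dot x_i
     + \sum_{i,j=1}^{n} \beta_{ij}\, f_i'(x_i)\,\dot x_i\, f_j(x_j)
   = -\sum_{i=1}^{n} \dot x_i\, f_i'(x_i)
     \Bigl[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij} f_j(x_j)\Bigr].
\end{align*}
```

Substituting $\dot x_i = a_i(x)\bigl[\gamma_i(x_i) - \sum_{j} \beta_{ij} f_j(x_j)\bigr]$ from (2.1) into the last expression then gives (2.9).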
If some $f_i$ is not differentiable, an alternative computation yields the same result. Namely, consider
$$\dot V(x) = \limsup_{h \to 0^+} \frac{1}{h}\bigl[V(x + hF(x)) - V(x)\bigr],$$
where $F(x)$ is the vector field in (2.1), cf. [8]. The detailed computation is similar to the one in [10]. Let $S$ be the set on which $V$ remains constant along an orbit of (2.1), that is,
$$S = \{x \in \mathbb{R}^n : \dot V(x) = 0\}.$$
Then, the closure of $S$ can be represented by
$$\bar S = \Big(\bigcup_{\sigma \in \Lambda_e} \Omega_\sigma\Big) \cup \Big(\bigcup I_\sigma\Big) \cup E_0. \tag{2.10}$$
Herein, $\bigcup_{\sigma \in \Lambda_e} \Omega_\sigma$ is the union of all exterior regions, $E_0$ is the set of equilibria in the interior region, and $\bigcup I_\sigma$ is the union of the subsets in mixed regions, as discussed in (2.6), whenever they exist. We shall call each point (an equilibrium) of $E_0$, each of the exterior regions $\Omega_\sigma$, and each of these $I_\sigma$, a component of $S$.
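Next, we introduce the regional Lyapunov function $V_\sigma$ for (2.1) restricted to each exterior region $\Omega_\sigma$ or each $I_\sigma$ in a mixed region. Consider an exterior region $\Omega_\sigma$, $\sigma \in \Lambda_e$. Let
$$V_\sigma(x) = -\sum_{i=1}^{n}\Big[\int^{x_i} \gamma_i(\xi)\,d\xi - x_i\sum_{j=1}^{n} \beta_{ij}\omega_j\Big], \tag{2.11}$$
where $\omega_j$ is as defined in (2.3). The derivative of this function along a solution of (2.1) lying in $\Omega_\sigma$ is
$$\dot V_\sigma(x) = -\sum_{i=1}^{n} \dot x_i\Big[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij}\omega_j\Big] = -\sum_{i=1}^{n} a_i(x)\Big[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij}\omega_j\Big]^2 \le 0.$$
The equality holds if and only if $a_i(x)\bigl[\gamma_i(x_i) - \sum_{j=1}^{n} \beta_{ij}\omega_j\bigr] = 0$ for every $i \in N_n$. That is, $\dot V_\sigma(x)$ only vanishes at an exterior equilibrium $x$ in $\Omega_\sigma$.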
Suppose $I_\sigma$ lies in a mixed region $\Omega_\sigma$, $\sigma \in \Lambda_m$. Recall that $J_0 = \{i \in N_n : \sigma_i = 0\}$ and $J_1 = N_n \setminus J_0$ and the notations in (2.6). Let
$$V_\sigma(x) = -\sum_{i \in J_1}\Big[\int^{x_i} \gamma_i(\xi)\,d\xi - x_i\sum_{j \in J_1} \beta_{ij}\omega_j - x_i\sum_{j \in J_0} \beta_{ij} f_j(\bar x_j)\Big], \tag{2.12}$$
where $\omega_j$ is as described in (2.5). It can be verified that $\dot V_\sigma(x)$, the derivative of $V_\sigma$ along a solution of (2.1) lying in $I_\sigma$, vanishes only at a mixed equilibrium in $I_\sigma$.
With the global Lyapunov function $V$ and these regional Lyapunov functions $V_\sigma$, we can then derive the following result. It extends Theorem 2.1 to the class of signal functions $f_i$ satisfying (H4′).
Theorem 2.2. Assume (H1), (H2), (H4′), (H5) and that $a_i(x) > 0$ for all $x \in \mathbb{R}^n$. System (2.1) is convergent if every equilibrium is isolated.
Proof. By changing the coordinates, it suffices to consider (2.1) with symmetric $B = [\beta_{ij}]$. Notably, if there is an equilibrium in a mixed region $\Omega_\sigma$, then a subset $I_\sigma$ described in (2.6) exists and this equilibrium lies on $I_\sigma$. With the assumption that every equilibrium is isolated, the components of $S$ are pairwise disjoint. Indeed, any two distinct exterior regions are disjoint. In addition, any two components belonging to two different regions $\Omega_\sigma$, $\Omega_{\sigma'}$, $\sigma \ne \sigma'$, are disjoint, since there is a $j \in N_n$ such that $x_j \ne \tilde x_j$ for any $x = (x_1, \ldots, x_n) \in \Omega_\sigma$ and any $\tilde x = (\tilde x_1, \ldots, \tilde x_n) \in \Omega_{\sigma'}$. Furthermore, the same argument justifies that any two components of $S$ belonging to the same $\Omega_\sigma$ are disjoint. Consider an orbit $\phi(t, x_0)$ and its ω-limit set, $\omega(\phi(t, x_0))$. It follows from the existence of the global Lyapunov function $V$ that $\omega(\phi(t, x_0))$ lies in one component of $S$. Let $x^* \in \omega(\phi(t, x_0))$. If $x^* \in E_0$, then $x^*$ is already an equilibrium. Suppose $x^* \in \Omega_\sigma$, $\sigma \in \Lambda_e$. Then $\phi(t, x^*) \in \Omega_\sigma$ for all $t$, since $V(\phi(t, x^*)) = V(x^*)$ for all $t$ and $V(\phi(t, x^*))$ decreases as $\phi(t, x^*)$ leaves $\Omega_\sigma$. By the existence of the regional Lyapunov functions $V_\sigma(x)$ in (2.11), it follows that $x^*$ has to be an exterior equilibrium. The same argument holds for $I_\sigma \subset \Omega_\sigma$, $\sigma \in \Lambda_m$. That is, if $x^* \in I_\sigma$, then $x^*$ has to be a mixed equilibrium in $\Omega_\sigma$. It follows that the ω-limit set of $\phi(t, x_0)$ consists of a single equilibrium. This completes the proof.
Remark. It can be shown by Sard’s theorem that the equilibrium points of (2.1) are isolated for almost every matrix of connection strength $B$, with a mild assumption on the values of the signal functions at the inhibitory thresholds. The verification is similar to the one in [4].
3. More generalizations
Theorem 2.2 is valid for other signal functions with saturations. For example, arguments similar to the proof of Theorem 2.2 confirm the convergence of (2.1) with one-sided signal functions $f_i$ (as in Fig. 3). This class of signal functions fits the setting of suprathreshold and subthreshold variables in [4].
Our result can further be extended to stairway-like multi-saturated signal functions. Let $m > 1$ be an integer. For $i \in \{1, 2, \ldots, n\}$, let each of $\{b^i_1, c^i_1, b^i_2, c^i_2, \ldots, b^i_m, c^i_m\}$ and $\{u^i_0, u^i_1, u^i_2, \ldots, u^i_m\}$ be a partition of $\mathbb{R}$ with $b^i_1 < c^i_1 < b^i_2 < c^i_2 < \cdots < b^i_m < c^i_m$ and $u^i_0 < u^i_1 < u^i_2 < \cdots < u^i_m$. For each $i = 1, 2, \ldots, n$, let $f_i$ be a continuous function defined by
$$f_i(\xi) = \begin{cases}
u^i_0 & \text{if } -\infty < \xi \le b^i_1, \\
u^i_j & \text{if } c^i_j \le \xi \le b^i_{j+1},\ j = 1, \ldots, m-1, \\
\text{increasing} & \text{if } b^i_j \le \xi \le c^i_j,\ j = 1, \ldots, m, \\
u^i_m & \text{if } c^i_m \le \xi < \infty.
\end{cases}$$
Such a signal function is illustrated in Fig. 4. For each $i = 1, 2, \ldots, n$, let $g_i : [u^i_0, u^i_m] \to \bigcup_{j=1}^{m} [b^i_j, c^i_j) \cup \{c^i_m\}$ be the function defined by $g_i(\xi) = (f_i|_{[b^i_j, c^i_j)})^{-1}(\xi)$ if $\xi \in [u^i_{j-1}, u^i_j)$ for $j = 1, 2, \ldots, m$, and $g_i(u^i_m) = c^i_m$, where $(f_i|_{[b^i_j, c^i_j)})^{-1}$ is the inverse function of $f_i$ restricted to $[b^i_j, c^i_j)$; note that $f_i$ maps $[b^i_j, c^i_j)$ onto $[u^i_{j-1}, u^i_j)$. Then the function $V$ in (2.7) is a global Lyapunov function for (2.1). The computations in (2.8) and (2.9) remain valid by arguments similar to those following (2.9). Thus, the convergence theorem for (2.1) with such signal functions can be analogously concluded by establishing the associated regional Lyapunov functions.
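A stairway-like signal function of the kind just described can be sketched as follows. The piecewise-linear ramps and the name `make_stairway` are assumptions for illustration; the lists are 0-based, so `b[j]`, `c[j]` bound the $(j+1)$-th increasing interval and `u[0..m]` are the plateau values.

```python
def make_stairway(b, c, u):
    """Multi-saturated f with m ramps [b[j], c[j]] and m + 1 plateaus u[0..m].

    Assumes b[0] < c[0] < b[1] < c[1] < ... and u[0] < u[1] < ... < u[m].
    Linear ramps are an illustrative choice of increasing pieces.
    """
    m = len(b)
    if len(c) != m or len(u) != m + 1:
        raise ValueError("need m ramps and m + 1 plateau values")

    def f(xi):
        for j in range(m):
            if xi <= b[j]:
                return u[j]        # plateau just below the ramp [b[j], c[j]]
            if xi < c[j]:
                # linear ramp from u[j] at b[j] to u[j + 1] at c[j]
                return u[j] + (u[j + 1] - u[j]) * (xi - b[j]) / (c[j] - b[j])
        return u[m]                # top plateau, xi >= c[m - 1]

    return f

# Illustrative data with m = 2 ramps and three saturation levels.
f = make_stairway(b=[0.0, 2.0], c=[1.0, 3.0], u=[0.0, 1.0, 3.0])
```

Each ramp carries one plateau value to the next, which is why the inverse of $f_i$ is taken ramp by ramp in the definition of $g_i$.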
Fig. 4. Multi-saturated signal function.
Finally, we note that it is not necessary for the signal functions $f_i$ in (2.1) to have the same number of saturations to conclude the convergence of dynamics. Restated, the number of saturations can range from 0 to any positive integer $m + 1$, and $m$ can vary with $i$.
Acknowledgements
The authors are supported, in part, by the National Science Council of Taiwan, ROC. The authors would like to thank the referee for calling their attention to the results of Gedeon and Maybee.
References
[1] L.O. Chua, L. Yang, Cellular neural networks: theory, IEEE Trans. Circuits Syst. 35 (1988) 1257.
[2] L.O. Chua, L. Yang, Cellular neural networks: applications, IEEE Trans. Circuits Syst. 35 (1988) 1273.
[3] L.O. Chua, CNN: A Paradigm for Complexity, World Scientific Series on Nonlinear Science, Series A, Vol. 31, World Scientific, Singapore, 1998.
[4] M.A. Cohen, S. Grossberg, Absolute stability of global pattern formation and parallel memory storage by competitive neural networks, IEEE Trans. Syst. Man Cybernet. SMC-13 (1983) 815.
[5] B. Fiedler, T. Gedeon, A class of convergent neural network dynamics, Physica D 111 (1998) 288.
[6] T. Gedeon, Structure and dynamics of artificial neural networks, Fields Inst. Commun. 21 (1999) 217.
[7] S. Grossberg, Competition, decision, and consensus, J. Math. Anal. Appl. 66 (1978) 470.
[8] J. Hale, Ordinary Differential Equations, Krieger, Florida, 1980.
[9] J.J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proc. Natl. Acad. Sci. 81 (1984) 3088.
[10] S.S. Lin, C.W. Shih, Complete stability for standard cellular neural network, Int. J. Bifur. Chaos 9 (5) (1999) 909.
[11] J. Maybee, Combinatorially symmetric matrices, Linear Algebra Appl. 8 (1974) 529.