Improved CMAC neural network control scheme


absolute values of the capacitive coupling coefficients of the calibration inputs differ at each stage due to variable total capacitance between stages.

The measured results with a ramp signal f v o = 500Hz at the input are shown in Fig. 4b (only the 4th bit is shown). The power consumption of the present circuit was only 67µW, when V,, = 2V and f,, = 10kHz. The measured results indicate that the circuit operates properly, and due to the calibration the bit number can be increased.

© IEE 1999

Electronics Letters Online No: 19990110

A. Rantala, S. Franssila, M. Aberg and P. Kuivalainen (VTT Electronics, PO Box 11012, FIN-02044 VTT, Finland)

K. Kaski and J. Lampinen (Laboratory for Computational Engineering, Helsinki University of Technology, Miestentie 3, PO Box 9400, FIN-02015 HUT, Finland)

P. Kuivalainen: Also with Electron Physics Laboratory, Helsinki University of Technology, Otakaari 7A, FIN-02150 Espoo, Finland

16 October 1998

References

1 SHIBATA, T., and OHMI, T.: ‘A functional MOS transistor featuring gate-level weighted sum and threshold operations’, IEEE Trans. Electron Devices, 1992, 39, pp. 1444-1455

2 KOTANI, K., SHIBATA, T., and OHMI, T.: ‘Impact of high-precision processing on the functional enhancement of neuron-MOS integrated circuits’, IEICE Trans. Electron., 1996, E79-C, pp. 407-414

Improved CMAC neural network control scheme

C.C. Lin and F.C. Chen

The cerebellar model articulation controller (CMAC) neural network control scheme is a powerful tool for practical real-time nonlinear control applications. However, despite its initial outstanding performance, the CMAC control system can suddenly diverge after a long period of stable tracking. The authors propose a modification to the CMAC control scheme in order to improve stability while preserving the original good performance.

Introduction: The cerebellar model articulation controller (CMAC) was proposed by Albus [1]. This neural network is capable of learning nonlinear functions extremely quickly due to the nature of its weight updating, so it is a powerful and practical tool for real-time control. Miller et al. [2, 3] proposed a feasible control scheme to combine the CMAC and a traditional controller (e.g. PID) for robot manipulator control. However, in our recent paper [4], it was shown that the Miller scheme can be unstable. The purpose of this Letter is to use the simple example in [4] to explain a modification that can be made to significantly improve the stability of the CMAC control system.

Fig. 1 CMAC control system

ELECTRONICS LETTERS 21st January 1999 Vol. 35

Control problem and original scheme: The CMAC control system is shown in Fig. 1. The simple system to be controlled is

$$Y(k+1) = G(Y(k), U(k)) \tag{1}$$

The control $U(k)$ is generated by

$$U(k) = U_p(k) + U_c(k) \tag{2}$$

where $U_p(k) = P(Y_d(k) - Y(k))$ is the output of the traditional constant gain controller, $P$ being the proportional gain and $Y_d$ being the desired system output. $U_c(k) = \mathrm{CMAC}(Y_d(k+1), Y(k))$ is the output of the CMAC module for the input vector $(Y_d(k+1), Y(k))$. Then the CMAC is updated by the following gradient-type learning rule:

$$W_i(k+1) = W_i(k) + \frac{\alpha}{g}\bigl(U(k) - \mathrm{CMAC}(Y(k+1), Y(k))\bigr) \tag{3}$$

where $g$ is the size of generalisation, $W_i$ is the content of the $i$th memory location (there are $g$ locations to be updated), and $\alpha$ is the learning rate. In essence, in this scheme, the constant gain controller helps the CMAC to learn the inverse of the plant. However, these two controllers are independent, and the conflicts between them are the source of the instability [4].
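The learning rule of eqn. 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `CMAC`, the input range, the quantisation step, the layered table layout and the default learning rate are all hypothetical choices made for the example. Each input activates one cell in each of g offset layers, the output is the sum of the g addressed weights, and every addressed weight receives the same share of the correction.

```python
import numpy as np


class CMAC:
    """Minimal two-input CMAC sketch (layout and parameters are assumptions)."""

    def __init__(self, g=8, lo=-4.0, hi=4.0, resolution=0.05, alpha=0.5):
        self.g = g                    # size of generalisation
        self.lo, self.hi = lo, hi     # assumed input range for both inputs
        self.resolution = resolution  # assumed quantisation step
        self.alpha = alpha            # learning rate
        n_steps = int(np.ceil((hi - lo) / resolution)) + 1
        n_cells = n_steps // g + 2
        # One coarse 2-D grid per layer; each layer is shifted by one step.
        self.w = np.zeros((g, n_cells, n_cells))

    def _cells(self, x):
        """Index of the g memory locations addressed by input vector x."""
        x = np.clip(np.asarray(x, dtype=float), self.lo, self.hi)
        q = np.floor((x - self.lo) / self.resolution).astype(int)
        layers = np.arange(self.g)
        cells = (q[None, :] + layers[:, None]) // self.g
        return layers, cells[:, 0], cells[:, 1]

    def output(self, x):
        """CMAC output: sum of the g addressed weights."""
        return float(self.w[self._cells(x)].sum())

    def update(self, x, target):
        """Eqn. 3: add (alpha / g) * (target - CMAC(x)) to each addressed weight."""
        err = target - self.output(x)
        self.w[self._cells(x)] += self.alpha * err / self.g
        return err
```

In the original scheme the module would be queried with (Y_d(k+1), Y(k)) to produce U_c(k) and, once Y(k+1) has been measured, trained with `update((Y(k+1), Y(k)), U(k))`. Because neighbouring inputs share most of their g cells, one update also shifts the output for nearby inputs, which is the generalisation property the text refers to.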

Proposed modification: Instead of having the system inverse solely learned by the CMAC, the constant gain controller and the CMAC are integrated, and will be made to approximate the system inverse together. In other words, the CMAC will be used to learn the difference between the system inverse and the constant gain control. To carry out this modification, the constant gain control has to be calculated based on the same input vector as is applied to the CMAC. In eqn. 2, since the input vector to the CMAC is $(Y_d(k+1), Y(k))$, the proportional control $U_p$ is modified to be $U_p(k) = P(Y_d(k+1) - Y(k))$. After the control $U(k) = U_p(k) + U_c(k)$ is applied to the plant, the CMAC goes through the learning process according to

$$W_i(k+1) = W_i(k) + \frac{\alpha}{g}\bigl(U(k) - U_p(k) - U_c(k)\bigr) \tag{4}$$

Note that, in eqn. 4, $U_p(k)$ is calculated by $P(Y(k+1) - Y(k))$, and $U_c(k)$ is generated from the input vector $(Y(k+1), Y(k))$, where $Y(k+1)$ is the actual system output at time step $k+1$.
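The bookkeeping of the modified scheme can be sketched as below. The function names and argument names are hypothetical, and the CMAC outputs are passed in as plain numbers so that the sketch does not depend on any particular CMAC implementation. The point it illustrates is that the proportional term is evaluated twice: once from the desired output when the control is generated, and once from the actual output when the training error of eqn. 4 is formed.

```python
def proportional(P, y_ref, y):
    """Constant gain control P * (y_ref - y)."""
    return P * (y_ref - y)


def control_modified(P, yd_next, y, cmac_out_desired):
    """Control applied to the plant: U(k) = U_p(k) + U_c(k).

    U_p(k) uses the desired output Y_d(k+1); cmac_out_desired is the
    CMAC output for the input vector (Y_d(k+1), Y(k)).
    """
    return proportional(P, yd_next, y) + cmac_out_desired


def weight_increment_modified(alpha, g, P, u, y_next, y, cmac_out_actual):
    """Per-location weight increment of eqn. 4.

    u is the control that was applied, y_next the measured Y(k+1), and
    cmac_out_actual the CMAC output for the input vector (Y(k+1), Y(k)).
    """
    u_p = proportional(P, y_next, y)   # recomputed from the actual output
    return alpha / g * (u - u_p - cmac_out_actual)
```

One consequence that follows directly from these definitions: if the plant output reaches the desired value exactly, so that Y(k+1) = Y_d(k+1), both evaluations use the same input vector, the bracket in eqn. 4 vanishes, and the weights are left unchanged.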

Simulation results: As in [4], the nonlinear system used in the simulation is $Y(k+1) = 0.5Y(k) + \sin(Y(k)) + U(k)$. This system is expected to track the command $\sin(2\pi k/400)$, with each cycle containing 400 time steps. The results using the old and the modified schemes are shown in Figs. 2 and 3, respectively.
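The simulated plant, the command, and the error measure plotted in the figures (largest tracking error in each cycle) can be set up as in the sketch below. It covers only the phase in which the constant gain controller acts alone, with no CMAC. The gain value `P = 1.0`, the zero initial condition and the function names are assumptions made for illustration, since the text does not state them.

```python
import math

P = 1.0       # proportional gain: hypothetical value, not given in the text
STEPS = 400   # time steps per sinusoidal cycle


def plant(y, u):
    """Nonlinear system Y(k+1) = 0.5 Y(k) + sin(Y(k)) + U(k)."""
    return 0.5 * y + math.sin(y) + u


def command(k):
    """Desired output Y_d(k) = sin(2 pi k / 400)."""
    return math.sin(2.0 * math.pi * k / STEPS)


def max_error_per_cycle(n_cycles, gain=P):
    """Largest |Y_d(k) - Y(k)| in each cycle, constant gain control only."""
    y = 0.0   # assumed initial condition
    errors = []
    for c in range(n_cycles):
        worst = 0.0
        for k in range(c * STEPS, (c + 1) * STEPS):
            e = command(k) - y
            worst = max(worst, abs(e))
            y = plant(y, gain * e)   # U(k) = P (Y_d(k) - Y(k))
        errors.append(worst)
    return errors
```

Calling `max_error_per_cycle(5)` gives a baseline against which a CMAC-assisted run can be compared; adding the CMAC term to the control inside the loop would turn this into a sketch of either scheme.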

Fig. 2 Tracking error with original scheme

Tracking error reduces significantly, but then diverges. Horizontal axis: number of sinusoidal cycles, each cycle containing 400 time steps; vertical axis: largest tracking error in each cycle

Fig. 3 Tracking error with modified scheme

Tracking error reduces significantly and keeps stable. Horizontal axis: number of cycles in log scale, each cycle containing 400 time steps; vertical axis: largest tracking error in each cycle


In Fig. 2, the system starts to diverge around the 200th cycle, whereas in Fig. 3 the system shows no tendency to instability when the program stops at the 7500th cycle. We are confident that stability can be maintained indefinitely, because we observe that all aspects of the modified closed-loop control system are stationary. Note that, in Figs. 2 and 3, the CMAC starts to function at the fifth cycle. Before that, only the constant gain controller is in charge. One can observe that the CMAC quickly and dramatically reduces the tracking error once it takes effect.

Conclusions: The better stability in the new scheme is due to the fact that the destabilising interactions (as described in [4]) between the CMAC and the constant gain controller in the original scheme have been eliminated. Although the new CMAC scheme is demonstrated by simple examples, the ideas can be extended to sophisticated and practical nonlinear control problems. This might open a range of new research areas. In addition, the usefulness of the CMAC neural networks justifies the need for parallel-processing CMAC IC chips, which we have been developing for several years, the details of which will be published elsewhere.

References

1 ALBUS, J.S.: 'A new approach to manipulator control: The cerebellar model articulation controller (CMAC)', J. Dyn. Syst., Meas. Control, 1975, pp. 220-227

2 MILLER, W.T., GLANZ, F.H., and KRAFT, L.G.: 'Application of general learning algorithm to the control of robotics manipulators', Int. J. Robot. Res., 1987, 6, pp. 84-98

3 MILLER, W.T., HEWES, R.P., GLANZ, F.H., and KRAFT, L.G.: 'Real-time dynamic control of an industrial manipulator using a neural-network-based learning controller', IEEE Trans. Robot. Autom., 1990, 6, pp. 1-9

4 CHEN, F.C., and CHANG, C.H.: 'Practical stability issues in CMAC neural network control systems', IEEE Trans. Control Syst. Technol., 1996, 4, pp. 86-91

10Gbit/s all-optical regenerative memory using single SOA-based logic gate

R.J. Manning, I.D. Phillips, A.D. Ellis, A.E. Kelly, A.J. Poustie and K.J. Blow

The authors report an all-optical regenerative memory operating at 10Gbit/s using a semiconductor optical amplifier-based interferometer as a regenerator in an optical fibre loop. Error-free operation was measured after more than 30 000 circulations of the stored data.

Introduction: An all-optical memory is one of the possible building blocks for future ultra-high speed photonic networks, to be used, for example, in the buffering and queuing of high bit rate (100Gbit/s) packets for subsequent processing. Considerable progress has been made recently in the development of a stable, all-optical regenerative memory [1]. In this approach, two semiconductor optical amplifier (SOA)-based loop mirror interferometers, using the TOAD/SLALOM [2 - 4] topology, were combined in an optical fibre loop and regenerated an input data 'word' twice every circulation. Two different wavelengths were used to distinguish 'probe' from 'switching' pulses. The second TOAD was used to wavelength convert 1Gbit/s data switched from the first TOAD so that it was the correct wavelength to be regenerated in the first TOAD. Here, we describe a simpler architecture which uses a single all-optical logic gate where 'pump' and 'probe' pulses are distinguished by propagation direction. We use an ultrafast nonlinear interferometer (UNI) [5, 6] and demonstrate an all-optical regenerative memory operating at 10Gbit/s. The advantages of regeneration upon each circulation are retained in this configuration. In principle it is also possible to use polarisation diversity for single gate applications [7].

Fig. 1 Schematic diagram of experimental setup

Single gate memory comprises UNI, which acts as a single gate regenerator, and ~1km of dispersion shifted fibre, which is the storage ring

Experiment: A schematic diagram of the experimental arrangement is shown in Fig. 1. The pulse source used was a 10GHz external cavity modelocked semiconductor laser (ECMLL), which produced 3.5ps pulses (FWHM) at a wavelength of 1550nm. These pulses were split using a 3dB coupler, one output being used for the probe pulse input to the UNI. The other output from the coupler was passed through a lithium niobate amplitude modulator driven by a 10Gbit/s pseudo-random data sequence. These data pulses were amplified in an erbium-doped fibre amplifier (EDFA) and passed through an acousto-optic modulator (AOM1), which was usually non-transmitting. Another coupler allowed 20% of each data pulse into the memory loop. Data pulses in the loop were amplified using an EDFA, filtered to remove unwanted spontaneous emission, and introduced as switching pulses into the UNI. The switched probe pulses from the UNI passed through ~1km of dispersion shifted fibre (having a dispersion zero at a wavelength of ~1550nm), a further EDFA and an acousto-optic modulator (AOM2) (usually transmitting). After the coupler, these pulses became switching pulses for the next circulation of the loop memory, 20% passing to a bit error rate test (BERT) set via a high-speed optoelectronic detector.

The UNI was used in a counter-propagating configuration similar to that described in [5, 6]. Probe pulses input to the UNI passed through an EDFA, to control the probe power, and were launched at 45° to the axes of polarisation maintaining (PM) fibre, thus splitting into orthogonally polarised pulse pairs. The pulses of each pair were separated by 15ps after propagation through 7m of fibre due to polarisation mode dispersion (PMD), and were input to a polarisation insensitive SOA with a mean power of ~-1dBm. After the SOA the pulse pair was launched at 45° into a second 7m length of PM fibre, so the pulse pair suffered a delay of the opposite sense and hence recombined. The resultant pulse passed through a fibre polariser P.

Switching pulses were input via the 3dB coupler with a mean power of -2dBm and were counter-propagating to the probe pulse pair. The switching pulse causes a change in the gain and hence the refractive index of the SOA and this affects the relative phase of the pulse pair, a phase difference of π resulting in a polarisation rotation of 90° at the polariser of the UNI. The SOA used here had a gain recovery time of Sops at a current bias of 400mA, and an alpha parameter of ~9 at 1550nm [8]. Error-free operation at 10Gbit/s was easily achievable by adjustment of the probe pulse pair power, which acted as a 'holding beam' to reduce the effective lifetime of the SOA [4, 6].

The memory operation was started by simultaneously switching AOM1 to transmission mode and AOM2 to rejection mode (both acousto-optic modulators had rise and fall times of ~20ns). This allowed pulses to be loaded into the loop and prevented pulses already in the loop from being recirculated. The loop was filled with the required data pulses and the modulators were then both returned to their initial state, after which time the data in the memory was allowed to continuously recirculate up to a maximum time of ~185ms. This limit was imposed by the timing electronics