** Finite Automata**

**2.1 DETERMINISTIC FINITE AUTOMATA**

56 Chapter 2: FINITE AUTOMATA

on the unbounded one.

Even if one thinks, as we do, that the correct way to model computers and algorithms is in terms of an unbounded memory capacity, we should first be sure that the theory of finite automata is well understood. It turns out that their theory is rich and elegant, and when we understand it we shall be in a better position to appreciate exactly what the addition of auxiliary memory accomplishes in the way of added computational power.

A further reason for studying finite automata is their applicability to the
design of several common types of computer algorithms and programs. For
*example, the lexical analysis phase of a compiler (in which program units such *
as 'begin' and

### '+'

are identified) is often based on the simulation of a finite autotnaton. Also, the problem of finding an occurrence of a string within another*--for example, whether any of the strings air', water, earth, and fire occur in the*

*text of Elements of the Theory of Computation*

### t -

can also be solved efficiently by methods originating from the theory of finite automata.Input tape

Finite control

Figure 2-1

Let us now describe the operation of a finite automaton in more detail.

Strings are fed into the device by means of an input tape, which is divided into
squares, with one symbol inscribed in each tape square (see Figure 2-1). The
main part of the machine itself is a "black box" with innards that can be, at
any specified moment, in one of a finite number of distinct internal states. This
black box -called the finite control- can sense what symbol is written at any
position on the input tape by means of a movable reading head. Initially, the
reading head is placed at the leftmost square of the tape and the finite control is
set in a designated initial state. At regular intervals the automaton reads one
*symbol from the input tape and then enters a new state that depends only on the *

### t

All four of them do, three of them outside this page.2.1: Deterministic Finite Automata 57
*current state and the symbol just read *-this is why we shall call this device a
*deterministic finite automaton, to be contrasted to the nondeterministic version *
introduced in the next section. After reading an input symbol, the reading head
moves one square to the right on the input tape so that on the next move it will
read the symbol in the next tape square. This process is repeated again and
again; a symbol is read, the reading head moves to the right, and the state of
the finite control changes. Eventually the reading head reaches the end of the
input string. The automaton then indicates its approval or disapproval of what
it has read by the state it is in at the end: if it winds up in one of a set of final
states the input string is considered to be accepted. The language accepted
by the machine is the set of strings it accepts.

When this informal account is boiled down to its mathematical essentials, the following formal definition results.

Definition *2.1.1: A deterministic finite automaton is a quintuple M *

(K,~, J, s, *F) *where

*K *is a finite set of states,

~ is an alphabet,

s E K is the initial state,

*F *~ *K *is the set of final states, and

J, the transition function, is a function from *K *x ~ to *K. *

*The rules according to which the automaton M picks its next state are *
encoded into the transition function. Thus if *M * *is in state q *E *K * and the
symbol read from the input tape is *a *E ~, then *J(q, a) * E *K * is the uniquely
determined state to which K passes.

Having formalized the basic structure of a deterministic finite automaton,
*we must next render into mathematical terms the notion of a computation by *
an automaton on an input string. This will be, roughly speaking, a sequence
of configurations that represent the status of the machine (the finite control,
reading head, and input tape) at successive moments. But since a deterministic
finite automaton is not allowed to move its reading head back into the part of
the input string that has already been read, the portion of the string to the left
of the reading head cannot influence the future operation of the machine. Thus
a configuration is determined by the current state and the unread part of the
string being processed. In other words, a configuration of a deterministic
finite automaton (K,~, J, *s, F) is any element of K x *~*. For example, the
configur ation illustrated in Figure 2-1 is *(q2, ababab). *

The binary relation I- M holds between two configurations of 111 if and only
if the machine can pass from one to the other as a result of a single move. Thus
if *(q,w) *and *(q',w') are two configurations of M, then (q,w) *I-M *(q',w') *if and
*only if w *

### =

*aw' for some symbol a*E ~, and

*J(q, a)*

### =

*q'.*In this case we say

58 Chapter 2: FINITE AUTOMATA
that *(q,w) *yields *(q',w') *in one step. Note that in fact I-M is a function from
*K *x ~+ to *K *x ~*, that is, for every configuration except those of the form *(q, *e)
there is a uniquely determined next configuration. A configuration of the form
*(q, e) signifies that M has consumed all its input, and hence its operation ceases *
at this point.

We denote the reflexive, transitive closure of I-M by I-~; *(q,w) *I-~ *(q',w') *
is read, *(q,w) *yields *(q',w') *(after some number, possibly zero, of steps). A
string *w *E ~* is said to be accepted by *M if and only if there is a state q E F *
such that *(s, w) *I-~ *(q, e). * Finally, the language accepted by *M, L(M), *is
the set of all strings accepted by *M. *

Example 2.1.1: Let *M *be the deterministic finite automaton (K,~, J, s, *F), *
where

*K={qo,qd, *

~ = *{a,b}, *
s = *qo, *
*F *= *{qo}, *
and J is the function tabulated below.

*q * *(J * *J(q, (J) *
*qo a * *qo *
*qo b * *ql *
*ql * *a * *ql *
*ql * *b * *qo *

It is then easy to see that *L( M) *is the set of all strings in *{a, b} ** that have
an even number of *b's. For M passes from state qo *to *ql *or from *ql *back to *qo *
*when a b is read, but *111 essentially ignores a's, always remaining in its current
*state when an a is read. Thus M counts b's *modulo 2, and since *qo *(the initial
*state) is also the sole final state, M accepts a string if and only if the number *
of b's is even.

If *M *is given the input *aabba, *its initial configuration is *(qo, aabba). *Then
*(qo, aabba) *I-M *(qo, abba) *

I-M *(qo,bba) *
I-M *(ql,ba) *
I-M *(qO, a) *
I-M *(qo, e) *

Therefore *(qo, aabba) *I-j,{ *(qo, e), *and so *aabba *is accepted by *M. *

### <:;

2.1: Deterministic Finite Automata 59

Figure 2-2

The tabular representation of the transition function used in this example
is not the clearest description of a machine. We generally use a more convenient
graphical representation called the state diagram (Figure 2-2). The state
diagram is a directed graph, with certain additional information incorporated
into the picture. States are represented by nodes, and there is an arrow labeled
*with a from node q to q' whenever J(q, a) *

### =

*q'.*Final states are indicated by double circles, and the initial state is shown by a

### >.

Figure 2-2 shows the state diagram of the deterministic finite automaton of Example 2.1.1.Example 2.1.2: *Let us design a deterministic finite automaton M that accepts *
the language *L( M) *= *{w *E *{a, b} ** : *w *does not contain three consecutive *b's}. *

We let *M *= *(K,"L"J,s,F), *where

*K *= *{qo, ql , q2, q3 } , *

*"L,={a,b}, *
s = *qo, *

*F *= *{qO,ql,q2}, *
and J is given by the following table.

*q * *(J * _{J(q, }(J) *qo * ^{a }*qo *
*qo * ^{b }*ql *
*ql * ^{a }*qo *
*ql * ^{b }*q2 *
*q2 * *a * *qo *
*q2 * *b * *q3 *
*q3 * ^{a }*q3 *
*q3 * *b * *q3 *

*The state diagram is shown in Figure 2-3. To see that M does indeed accept *
the specified language, note that as long as three consecutive *b's *have not been
*read, M is in state qi (where *i is 0, 1, or 2) immediately after reading a run of
i consecutive *b's *that either began the input string or was preceded by an *a. *In
*particular, whenever an a is read and M is in state qo, ql, or q2, M returns to *

60 Chapter 2: FINITE AUTOMATA
*its initial state qo. States qo, ql, and q2 are all final, so any input string not *
*containing three consecutive b's will be accepted. However, a run of three b's *
*will drive AI to state q3, which is not final, and AI will then remain in this state *
*regardless of the symbols in the rest of the input string. State q3 is said to be *
*a dead state, and if AI reaches state q3 it is said to be tmpped, since no further *
input can enable it to escape from this state.O

Figure 2-3

Problems for Section 2.1

*2.1.1. Let M be a deterministic finite automaton. Under exactly what *
circum-stances is e E *L(AI)? *Prove your answer.

2.1.2. Describe informally the languages accepted by the deterministic finite au-tomata shown in the next page.

2.1.3. Construct deterministic finite automata accepting each of the following lan-guages.

(a) *{w *E *{a, b}' : *each *a *in *w *is immediately preceded by a *b}. *

(b) *{w E {a,b}' : w has abab as a substring}. *

( c) *{w *E *{a, b}' : w has neither aa nor bb as a substring}. *

(d) *{w *E *{a,b}' : w has an odd number of a's and an even number of b's}. *

(e) *{w *E *{a,b}' : w has both ab and ba as substrings}. *

2.1.4. A deterministic tinite-state transducer is a device much like a deter-ministic finite automaton, except that its purpose is not to accept strings or languages but to transform input strings into output strings. Informally, it starts in a designated initial state and moves from state to state, depending on the input, just as a deterministic finite automaton does. On each step, however, it emits (or writes onto an output tape) a string of zero or one or more symbols, depending on the current state and the input symbol. The state diagram for a deterministic finite-state transducer looks like that for a deterministic finite automaton, except that the label on an arrow looks like

2.1: **Deterministic Finite Automata **

*a, b *
*a *

*a *

*a *

*a,b *

*a *

*b *

**61 **
*a *

(1....---~O

*b *

*a,b *

*a,b *
*a,b *

*b * *a *

*a,b *

*b *
*b *

*a *

62 Chapter 2: **FINITE **AUTOMATA

*I *

^{bib }### 1Faa~

*bib * *ale *

*alw, which means "if the input symbol is a, follow this arrow and emit *
*out-put w". For example, the deterministic finite-state transducer over {a, b} *

*shown above transmits all b's in the input string but omits every other a. *

(a) Draw state diagrams for deterministic finite-state transducers over *{a, b} *

that do the following.

*(i) On input w, produce output an, where n is the number of occurrences *
*of the substring ab in w. *

(ii) On input w, produce output an, where n is the number of occurrences
*of the substring aba in w. *

*(iii) On input w, produce a string of length w whose ith symbol is an a if *
i

### =

1, or if i### >

1 and the ith and (i - l)st symbols of ware different;otherwise, the ith symbol of the output is a *b. * For example, on input
*aabba the transducer should output ababa, and on input aaaab it should *
*output abbba. *

(b) Formally define

(i) a deterministic finite-state transducer;

(ii) the notion of a configuration for such an automaton;

(iii) the yields in one step relation f- between configurations;

*(iv) the notion that such an automaton produces output u on input w; *

(v) the notion that such an automaton computes a function.

**2.1.5. A deterministic 2-tape finite automaton is a device like a deterministic finite **
automaton for accepting pairs of strings. Each state is in one of two sets;

depending on which set the state is in, the transition function refers to the
first or second tape. For example, the automaton shown below accepts all
pairs of strings *(Wl,W2) *E *{a,b}* x {a,b}* such that *

### IW21

= 2lwll·**... -.. ---i ** **r---.. **

, ,

, ,

**! ** **a,b! ****a,b **

### ~~ i

_{, }

**!**

_{, }

**a,b**, ,

, ,

**___ .. __________ ! ** **L _ .. _ .. _____________ .. __ .. ___ _ **

States for first tape

States for second tape

2.2: Nondeterministic Finite Automata 63 (a) Draw state diagrams for deterministic 2-tape finite automata that accept each of the following.

(i) All pairs of strings *(WI,W2) *in *{a,b}* *x *{a,b}* *such that

### IWII

=### IW21,

and wI(i)

*i-*

*w2(i)*for all i.

(ii) All pairs of strings *(WI, W2) *in *{a, b}' *x *{a, b} ** such that the length of

*W2 *is twice the number of a's in * ^{Wi }*plus three times the number of b's
in

*Wi.*

(iii) *{(anb,anbm) : n,m;:::: *O}.

(iv) *{(anb, ambn) : n, *m ;:::: O}.

(b) Formally define

(i) a deterministic 2-tape finite automaton;

(ii) the notion of a configuration for such an automaton;

(iii) the yields in one step relation f- between configurations;

(iv) the notion that such an automaton accepts an ordered pair of strings;

(v) the notion that such an automaton accepts a set of ordered pairs of strings.

2.1.6. This problem refers to Problems 2.1.5 and 2.1.6. Show that if

*f : *

~* f-t ~* is
a function that can be computed by a deterministic finite-state transducer,
then {( *w,*

*f *

*(w)) :*

*W*E ~*} is a set of pairs of strings accepted by some deterministic two-tape finite automaton.

2.1. 7. We say that state *q *of a deterministic finite automaton *M *= *(K, *~,J, *qo, F) *
is *reachable *if there exists *W * E ~* such that *(qo, w) *f-M *(q, e). * Show that
if we delete from *NI * any nonreachable state, an automaton results that
accepts the same language. Give an efficient algorithm for computing the
set of all reachable states of a deterministic finite automaton.

### liiJ

NONDETERMINISTIC FINITE AUTOMATAIn this section we add a powerful and intriguing feature to finite automata.

This feature is called nondeterminism, and is essentially the ability to change states in a way that is only partially determined by the current state and input symbol. That is, we shall now permit several possible "next states" for a given combination of current state and input symbol. The automaton, as it reads the input string, may choose at each step to go into anyone of these legal next states;

the choice is not determined by anything in our model, and is therefore said to
be *nondeterministic. * On the other hand, the choice is not wholly unlimited
either; only those next states that are legal from a given state with a given input
symbol can be chosen.

**64 ** **Chapter 2: FINITE AUTOMATA **