Chapter 6 Languages:
finite state machines
Yen-Liang Chen
Dept of Information Management
National Central University
6.1 Language: the set theory
of strings
We use  to denote a nonempty finite
set of symbols, collectively called an
 Definition 6.1. If  is an alphabet and
nZ+, we define the power of  as
follows: (1) 1=; and (2) n+1={xy
x, yn}, where xy denotes the
juxtaposition of x and y.
 Ex 6.1
Empty string and sentences
Definition 6.2. 0={}, where  denotes the empty
(1)Although , ; (2) {} since ; (3) {}
because {}=1.
We refer to the elements of + or * as strings,
words, sentences
Ex 6.2
Equal and concatenation
Definition 6.4. Two strings w1=x1x2 …xn+ and
w2=y1y2…ym+ are equal, written as w1=w2, if n=m
and xi=yi for 1in.
Definition 6.5. Let w=x1x2 …xn +. The length of w,
denoted as w, is n.
Definition 6.6. Let x=x1x2 …xn+ and y=y1y2…ym+
The concatenation of x and y, xy, is x1x2…xny1y2…ym.
The concatenation of x and  is x=x. The
concatenation of  and x is x=x.
Finally, the
concatenation of  and  is . Since x=x=x, the
element  is the identity for the operation of
Power, prefix and postfix
Definition 6.7. The power of x is
defined as: x0=, x1=x, and xn+1=x xn.
Ex 6.3
Definition 6.8. If x, y* and w=xy,
then x is a prefix of w, and if y, then
x is a proper prefix of w. Similarly, y is
a suffix of w; it is a proper suffix when
Ex 6.4: Consider the string w=abbcc.
What are the prefixes, proper prefixes,
suffixes and proper suffixes of w?
 Ex 6.6, If w=w1w2=w3w4, then (1) w1 is
a prefix of w3, or w3 is a prefix of w1;
and (2) w2 is a suffix of w4, or w4 is a
suffix of w2.
Let w=(abb)(cc)=(a)(bbcc)
Substring and language
Definition 6.9. If x, y, z* and w=xyz, then
y is called as a substring of w. When at
least one of x and z is different from , we
call y a proper substring.
Ex 6.7
Definition 6.10. For a given , any subset of
* is called a language over . This
includes , the empty language.
Ex 6.8, Ex 6.9.
the concatenation of
Definition 6.11. For languages A , B in * ,
the concatenation of A and B, denoted AB,
is {abaA, bB}.
Note that ABBA and ABBA.
Ex 6.10
Theorem 6.1. For A, B, C *, we have (a)
A{}={}A=A; (b) (AB)C=A(BC); (c)
A(BC)=ABAC; (d) (BC)A=BCCA; (e)
A(BC)ABAC; (f) (BC)A BACA.
x, xy in A; yz in B; z in C xyz in ABAC
But xyz not in A(BC)
Ex 6.11. A={x}, then (1) A0={}; (2) An={xn};
(3) A+={xn n1}; (4)A*={xn n0}
Ex 6.13. A={, x, x3, x4,…} and B={xn n0}.
Then A2=B2 but AB.
Ex 6.12
A={xx, xy, yx, yy}. A* is the language of all
strings w in * where the length of w is even.
A={xx, xy, yx, yy} and B={x, y}. What is BA*?
What is {x}{x, y}*? What is {x}{x, y}+?
What is {x, y}*{yy}?
What is {x}*{y}*?
Why {x}*{y}* {x, y}*?
Lemma 6.1. Let A, B*. If AB, then for all
nZ+, AnBn.
Theorem 6.2. For A, B *, we have
(a) AAB*,
(b) AB*A;
(c) AB  A*B*,
(d) AB  A+B+,
(e) AA*=A*A=A+,
(f) A*A*=A*=(A*)*=(A*)+=(A+)*;
(g) (AB)*=(A*B*)*=(A*B*)*.
Ex 6.14. Let ={0, 1} and A*, where each word in A
contains exactly one occurrence of the symbol 0. Then
the language can be defined as: (a) 0A, (b) 1x and x1
is in A, if x is in A.
Ex 6.15. Let ={(, )} and A*, where A contains those
nonempty strings of parentheses that are
grammatically correct for algebraic expressions. Then
the language can be defined as:
 (a) ( ) is in A;
 (b) For all x, y in A, we have
• (1) xyA, and
• (2) (x)A.
The reverse of string
Ex 6.16
 The reverse of x= x1x2 …xn is xR= xn
 We can define it recursively: (a) R=;
and (2) if x=zyn+1, where z in  and
y in n, then xR=(zy)R=(yR)z.
 Based on this definition, we can show
that for x1, x2*, we have
6.2 Finite state machine: a
first course
The machine can be in only one of finitely
many sates at a given time.
The machine will accept as input only a
finite number of symbols, referred to as
the input alphabet.
An output and a next state are determined
by each combination of inputs and internal
The machine operates in a deterministic
Finite state machine
A finite state machine is a five-tuple
M=(S, IA, OA, v, w), where S = the set
of internal states fro M; IA is the input
alphabet for M; OA is the output
alphabet; v: SIAS; w: SIA OA
Ex 6. 17
Ex 6.18
Ex 6.19
6.3 Finite state machine: a
second encounter
Ex 6.20. We want to
construct a machine that
recognizes each
occurrence of the
sequence 111 as it is
encountered in an input
string x*. This machine
is a recognizer of the
language A= {0, 1}*{111}.
For example, if
x=1110101111, then the
output is 0010000011.
Ex 6.21.
We want to recognize the occurrence of 111
that ends in a position that is a multiple of 3.
Ex 6.22.
We want to recognize the occurrence of 0101 in
an input string. Figure 6.12(a). We want to
recognize the occurrence of 0101 in an input
string but its start position is a multiple of four.
Figure 6.12(b).
Ex 6.23.
It is impossible to have a finite state
machine to represent A={0i1iiZ+}.
Suppose we can and let S=n1. Table 6.8
shows the state transition for string 0n+11n+1.
Since S=n, there must have two states si
and sj , where i< j, such that si=sj. Removing
the loop from si+1 to sj, we have the table
shown in Figure 6.10. This new sequence
means the machine can accept the string
x=0(n+1)-(j-i)1n+1. This is a contradiction.
Ex 6.24
One-unit delay
machine. If x=
x1x2…xn-1xn, then
the output will be
0 x1x2…xn-1.
Ex 6.25.
Two-unit delay
machine. If x=
then the
output will be 0
Definition 6.14.
For si, sjS, sj is reachable from si if si=sj or if
there is an input string x such v(si, x)=sj.
A state s is transient if v(s, x)=s for xIA*
implies x=. Once leaving, never go back to
A state s is sink if v(s, x)=s for xIA*.
A submachine of M. Let S1S and IA1IA. If
v1=vS1IA1:S1IA1S has its range within S1.
A machine is strongly connected if for any
states si, sjS, sj is reachable from si.
Transfer sequence
For a machine M, let si, sjS. An input string x is
called a transfer sequence from si to sj if (a) v(si,
x)=sj, (b) for any y with v(si, y)=sjyx.

Chapter 6 Languages: finite state machines