Characters and Strings
 — a finite set of (abstract) symbols, the alphabet
whose elements are called letters (or characters)
* — (infinite) set of all finite sequences over ,
called words (or strings)
a word w* has length, len(w) = the number of
letters in the sequence w (0, 1, 2, …)
 denotes the empty (or null) string with len() = 0
22C:135 – Part I
1
Language Operations
A language L is a subset of *, L *
The empty set is written as , and , , and {} are
all distinct entities
Languages can be combined with the familiar set
operations: union, intersection, and complement
Languages can also be concatenated by forming all
the possible combinations of concatenations of
their member strings. For L1,L2 *,
L1•L2 = {xy | xL1 and yL2}
22C:135 – Part I
2
Language concatenation behaves as a
“multiplicative” operator —  is a “zero” and {}
is an “identity”. In this sense we define powers
of a language.
For L * define L0 = {}, and inductively for n≥0
Ln+1 = Ln•L. Language powers satisify the usual laws
of exponents — Ln•Lm = Ln+m.
In terms of these powers we define two other
language operations. The Kleene closure (or star)

of L * is
Ln
*
L =

n=0

And the positive closure is
L+
=

Ln
n=1
22C:135 – Part I
3
The “Language” of
Regular Expressions
The regular expressions over an alphabet 
comprise a collection of formal notations. There
are strict rules for the formation of these
notations, and each is understood to describe a
language L *.
Since regular expressions have both a “syntax”
(rules of formation), and a “semantics” (the
associated language), we often speak of the
(meta) language of regular expressions.
22C:135 – Part I
4
Regular Expression Syntax
Regular expressions involve the symbols from 
plus several auxiliary symbols, and are defined inductively.
Basis case(s):
(a)  is a regular expression
(b)  is a regular expression
(c) each   is a regular expression
Inductive case(s):
If  and  are regular expressions, then so are:
(a) ( + )
(b) ( • )
(c) ( *)
22C:135 – Part I
5
Regular Expression Semantics
The meaning of each regular expression  is a
language () * whose definition parallels
the inductive definition of the syntax.
Basis case(s):
(a) () = 
(b) () = {}
(c) () = {}
Inductive case(s):
If  and  are regular expressions, then:
(a) ( + ) = ()  ()
(b) ( • ) = () • ()
(c) (*) = (())*
22C:135 – Part I
6
Regular Languages
L * is a regular language if there exists a
regular expression  so that L = ()
Regular expressions  and  are equivalent,   ,
provided that () = ()
To avoid excessive parentheses in regular
expressions, we assign precedence to the
operations: * highest, • intermediate, and
+ lowest; also, • may be omitted. For instance,
the regular expression (0•(1*)) is written 01*.
22C:135 – Part I
7
Example 1.1.3(g) — 00+11+101
This regular expression describes {00, 11, 101}.
Example 1.1.3(i) — 0*1*
The meaning in this case is the language
{, 0, 1, 00, 01, 11, 000, 001, 011, 111, … }
= {0p1q | p,q ≥ 0}.
Briefly, and informally, all strings where '1' is
followed by '1' or nothing
Example 1.1.3 (h) — 0*(10*10*)*
The language described by this regular expression
contains the strings:
 , 0, 00, 000 , …
11, 110, 101, 1100, 1001, 1010, 11000, …
011. 0110, 0101, 01100. 01001, …
Briefly, and informally, all strings with an even
number of '1's.
22C:135 – Part I
8
Theorem 1.1.1: For any regular expressions , 
and ,
(i)  +   +  ,
(ii) ( + ) +   + ( + ),
(iii) +    +   ,
(iv) ( )   ( ),
(v) ( + )   +   ,
(vi) ( + )     +  ,
(vii)       ,
(viii)       ,
(ix) *   ,
(x) ( + )*  *,
(xi) ( )*  ( )*  ,
(xii) (*)*  *,
(xiii) (* *)*  ( + )*,
(xiv) ( *)*  + ( + )*.
22C:135 – Part I
9
Assertion: the regular languages are the smallest
family of languages containing the finite
languages and closed under union, concatenation,
and star.
• each finite language is regular
• regular languages are “closed under”
union, concatenation, and star
• any family of languages that contains
the finite languages and is closed under
union, concatenation and star also
contains all the regular languages
Assertion: a regular expression denotes an
infinite language only if it includes *, and (*)
is infinite unless    or   .
22C:135 – Part I
10
Language Transformations
Let  and  be alphabets. A function h: * is
a homomorphism.
A homomorphism is also tacitly understood to
serve as a function h: * * as determined by
letter-by-letter extension inductively defined
by h() = , and h(x ) = h(x) h() for all x * and
 .
Finally, homomorphisms may be applied to languages,
h: p(*) p(*) , by the element-wise extension
h(L) = {h(x) | x  L} for each L *.
Note: p(S) is the collection of all subsets of set S
22C:135 – Part I
11
Let  and  be alphabets. A function s: p(*) is
a substitution (i.e., each letter of  is associated
with a language over ). A substitution is called
regular if each of the languages s() is regular.
A substitution is also tacitly understood to
serve as a function s: * p(*) as determined
by letter-by-letter extension inductively defined
by s() = {}, and s(x ) =s(x) s() for all x * and
 .
Finally, substitutions may be applied to languages,
s: p(*) p(*), by element-wise extension
s(L) =

s(x).
xL
22C:135 – Part I
12
Note that every homomorphism h can be regarded
as a substitution h' where if h() = w, h'() = {w}.
While homomorphisms map each letter to a unique
string, a general substitution provides numerous
optional strings for transforming each letter. For
this reason substitutions may be regarded as
“non-deterministic” homomorphisms.
Every homomorphism has a finite description, but
a substitution need not have a finite description.
If the languages associated with letters are
infinite, a finite description of the substitution
may be impossible.
22C:135 – Part I
13
Theorem 1.1.3: For each regular language R *
and each regular substitution s, s(R) is regular.
Corollary 1.1.4: For each regular language R *
and each homomorphism h, h(R) is regular.
So far we know that the regular language family
is closed under union, (set) concatenation,
Kleene closure (star), regular substitution, and
homomorphism.
22C:135 – Part I
14
Finite State Automata
Definition 1.2.1: A deterministic transition system
(DTS) is a triple, T = (S, , ), where S, the states,
and , the input alphabet, are finite non-empty
sets, and  is a function,  : S S, the next-state
function.
Definition 1.2.2: If T = (S, , ) is a deterministic
transition system, then the transition function,
* is the function, * : S *S, defined inductively
for all sS by: *(s, ) = s, and for all x* and
, *(s, x) = (*(s, x), ).
22C:135 – Part I
15
State Diagrams
State transitions may be presented either in
tabular or diagram form.
Example 1.2.1.

s0
s1
s2
0
s0
s2
s2
1
s1
s1
s2
Table of a next-state function
1
s
0
s
s1
0
0
1
2
0,1
State diagram of next-state function
22C:135 – Part I
16
Finite State Recognizers
Definition 1.2.3: A deterministic finite acceptor
(DFA) A is a 5-tuple A = (S, , , s0, R), where
(S, , ) is a deterministic transition system,
s0S is the start state, and R  S is the set of
recognizing (or accepting) states.
The language recognized (or accepted) by A is
L(A) = {x * | (s0, x) R}. A language L* is
DFA-recognizable if there exists a DFA A so that
L = L(A).
22C:135 – Part I
17
Example 1.2.2.
1
s
0
s
0
0
s
1
1
2
0 ,1
The language accepted is 0*11*
22C:135 – Part I
18
Example 1.2.3
L(A3) = all words ending with '101'
A3
0
s
0
1
0
s
0
1
0
s
2
1
1
s
3
1
I. Only words ending with 101 are recognized
• to reach s3 last letter is 1, prior state s2
• to reach s2 last letter is 0, prior state s1 or s3
• to reach s1 or s3 must use 1
II. All words ending with 101 are recognized
For input x101, after processing x, A3 is in some
state t, but (t, 101) = s3 for all states t.
22C:135 – Part I
19
The DFA model admits an obvious table driven
algorithm to automate testing membership of
candidate strings — for w  * , is w  L(A)?
Theorem 1.2.1: the complement of each
DFA-recognizable language is DFA-recognizable.
22C:135 – Part I
20
Non-deterministic Automata
Definition 1.2.4: A non-deterministic transition
system (NTS) is a triple, T = (S, , ), where S, the
states, and , the input alphabet, are finite nonempty sets, and  is a function,  : S p(S), the
next-state function.
Definition 1.2.5: If T = (S, , ) is a non-deterministic
transition system, then the transition function, *, is
the function, *: S * p(S), defined inductively for
all sS by: *(s, ) = {s}, and for all x* and ,

*(s, x) =
t
22C:135 – Part I
(t,).
x)
*(s,
21
Definition 1.2.6: A non-deterministic finite acceptor
(NFA) A is a 5-tuple, A = (S, , , s0, R), where
(S,  , ) is a non-deterministic transition system,
s0S is the start state, and R  S is the set of
recognizing (or accepting) states.
The language recognized (or accepted) by A is
L(A) = {x * | (s0, x)R≠}. A language L  * is
NFA-recognizable if there exists a NFA A so that
L = L(A).
For an NFA A as in Definition 1.2.6 and x*, say
x=12 … k (k≥0) with i(1≤i≤k), we say x has a
run s0, s1, … , sk in A if s1 (s0, 1), s2 (s1, 2), ... ,
sk (sk-1, k). A run is recognizing (or accepting)
if skR.
22C:135 – Part I
22
Example 1.2.4.
NFA
s
0
s1
0
0
s2
0,1
• inputs ending in '1' have one run ending
at s0
• inputs ending in '10' have two runs,
one ending at s0 and one at s1
• inputs ending in '00' have three runs, one
ending at s0, one at s1, and one at s2
•  recognizes words ending in '00'
22C:135 – Part I
23
Definition 1.2.7.
• A state t of a transition system is said to be
reachable from state s if there is x* so that
(s, x)=t (or t (s, x) in the non-deterministic
case).
• A state is called a generator if every state is
reachable from it.
• A transition system is called strongly connected
if every state is a generator.
22C:135 – Part I
24
Examples
s2 reachable from
s0, but not viceversa
A
1
s
0
s
0
s
1
0
0 ,1
Strongly connected
3
0
s
0
22C:135 – Part I
2
1
0
1
s
0
1
1
0
s
2
1
s
3
1
25
Note that each DFA is (effectively) an NFA — just
one where there is always exactly one possible
next state. Hence every DFA-recognizable language
is NFA-recognizable.
Theorem 1.2.2: Each NFA-recognizable language is
DFA-recognizable.
22C:135 – Part I
26
Example 1.2.4 revisited
NFA
s
0
s1
0
0
s2
accepts words
ending with '00'
0,1
DFA
1
0
s0
s0s1
0
1
1
1
0
0
s0s2
22C:135 – Part I
s0s1
s2
the subset construction
yields 4 more states unreachable from
s0
27
-move Automata
Definition 1.2.8: A null-move (or -move) nondeterministic transition system ( -NTS) is a triple,
T = (S, , ), where S, the states, and , the input
alphabet, are finite non-empty sets, and  is a
function, : S({}) p(S), the next-state
function.
When (s, ) is defined (i.e., not empty) it is called
a spontaneous transition — the state changes without input being
read.
The  -moves significantly complicate the ways in
which transitions cascade. So before we define
how an  -NTS responds to a sequence of inputs,
we must address the cascading of  -moves.
22C:135 – Part I
28
The behavior produced by multiple consecutive
-moves of an  -NTS is the first thing we capture
in our analysis.
Definition 1.2.9: for state s of an  -NTS,
-closure(s) is a set of states defined inductively:
(i) s -closure(s),
(ii) if t -closure(s) and u(t, ), then
u -closure(s),
(iii) nothing belongs to the -closure(s) unless it
follows from finitely many applications of (i) and (ii).
Also, for a set of states T,
-closure(T) = sT -closure(s).
22C:135 – Part I
29
Example 1.2.7
a
,0
b
-closure(a) = {a,b}
-closure(b) = {b}
1

-closure(c) = {a,b,c}
c
The basic idea of a run in an -NTS is that any
number of -moves may precede and follow reading
each letter from the input stream.
Definition 1.2.10: For -NTS T = (S, , ), the
transition function *: S*  p(S) is defined
inductively for each sS by:
(i) *(s, ) = -closure(s),
(ii) for x * and  ,
*(s, x) = -closure(
22C:135 – Part I

(t,)), where T = *(s,x).
tT
30
Definition 1.2.11: -NFA and -NFA recognizability
— just like NFA, except underlying transition
system is -NTS.
Note that any NFA is an -NFA so each NFArecognizable language is -NFA-recognizable.
Theorem 1.2.4: Each NFA-recognizable language
is NFA-recognizable.
22C:135 – Part I
31
Example 1.2.8.
B
A
0,1
a
,0
a
b
1
1
1
b
1
1
1
0,1

c
1
c
1
-NFA
22C:135 – Part I
NFA
32
Theorem 1.3.1: every DFA-recognizable language is regular. Hence
by Theorems 1.2.4 and 1.2.1 each
NFA-recognizable and -NFA-recognizable
language is also regular.
Theorem 1.3.2: each regular language is -NFArecognizable (and a single accepting state is
sufficient). Hence by Theorems 1.2.4 and 1.2.1 each
regular language is also NFA-recognizable and
DFA-recognizable.
22C:135 – Part I
33
Example 1.3.1.
s
L0 = +0
0
11
1
1
1
L(A) = L
13
L0 = 
13
0,1
s
3
s3
2
0
1 = L0  L0 (L0)* L0
L13
13
11
11
13
=  + (+0)(+0)*

See Dfa2regex in our class directory for a
Prolog program to perform these computations.
22C:135 – Part I
34
01*+1
01*
1*


1



1


0


22C:135 – Part I
35
Corollary 1.3.3: the regular languages are closed
under complementation and intersection.
Definition 1.3.1: the reversal of 12 … k * is
(12 … k)R = kk-1 … 1. The reversal of L  * is
LR = {xR | x  L}.
Theorem 1.3.4: for each regular language L  *,
LR is also regular.
22C:135 – Part I
36
The Pumping Lemma
Theorem 2.1.1(Pumping Lemma): Let L * be a
regular language. Then there exists an integer N (depending on L)
with the property that for each
zL with len(z)≥N there exist u,v,w* so that
i) z=uvw,
ii) len(uv)≤N and len(v)≥1, and
iii) uvkwL for each k=0,1,2, ...
22C:135 – Part I
37
Example 2.1.1.
Let  = {0, 1} and L = {0n1n | n≥0}. L is not regular.
Example
2
L = {0n | n≥0} is not regular.
Example 2.1.3.
Let L = {x* | len(x) is odd and the first and
middle symbols of x are the same},
where  = {0, 1}. L is not regular.
22C:135 – Part I
38
Proof Strategies
Prove language X regular:
known regular language
fop
X
fop is a “friendly operation” — one
that preserves regular languages
Prove language X non-regular:
X
fop
22C:135 – Part I
known non-regular language
39
Language quotients
Definition 2.2.1: Let K,L *. The left-quotient of
K by L is K\L = {y* | xK so that xyL}. The rightquotient of K by L is K/L = {x* | yL so that xyK}.
Examples
(01)*/1 = (01)*0
(01)*/11 = 
0*\0*1* = 0*1* 0*1\0*1* = 1*
Theorem 2.2.1: If K * is regular and L * is
arbitrary, then K/L and L\K are regular (but L/K
and K\L may not be).
22C:135 – Part I
40
Example 2.2.6.
Consider L = 0*1*  {1m0n1n | m≥1, n≥0}.
L cannot be proven non-regular by means of the
pumping lemma — the first letter of any string in
L can be pumped any number of times.
However, L can be proven non-regular by mapping
it to a known non-regular language.
Step 1: Let L1 = L  1+0*1* = {1m0n1n | m≥1, n≥0}
and L1 is regular if L is.
Step 2: Let L = 1+0\L = {0n-11n | n≥1} and
2
1
L2 is regular if L1 is.
Step 3: Let L3 = {}  {0}•L2 = {0n1n | n≥0}
and L3 is regular if L2 is. But L3 is not regular, and
hence L cannot be regular.
22C:135 – Part I
41
Theorem 2.2.3: for DFA A with N states
(i) L(A) ≠  iff there is zL(A) with len(z) < N
(ii) L(A) is infinite iff there is zL(A) with
N ≤ len(z) < 2N.
Theorem 2.2.3 assures us that there are algorithms
to test each of the following properties of DFAs:
• L(A) =  — check if zL(A) for all len(z)<N
•L(A) = * — test L(A) = 
•L(A) infinite — check if zL(A) for all N≤len(z)<2N
•L(A1)  L(A2) — test if L(A1)  L(A2) = 
• L(A1) = L(A2) — test if L(A1)  L(A2) and
L(A2)  L(A1)
22C:135 – Part I
42
Nerode equivalence
Definition 2.3.1: For a language L  *, define the
binary relation L-equivalence, written L, on * as
follows: for each x,y* x L y if x\L = y\L. The number of equivalence
classes is the index of L.
Assertion: L-equivalence is an equivalence relation
on * for every L  *.
• x  x for all x *
• x  y implies y  x for all x,y *
• w  x and x  y imply w  y for all w,x,y *
Example: L = 0*
0  00 since 0\L = 0* = 00\L
0 ± 1 since 0\L = 0* but 1\L = 
L has two classes, [0] = 0* and [1] = strings with ‘1’
22C:135 – Part I
43
Theorem 2.3.2: L * is regular if and only if L has
Finite index.
Corollary 2.3.3: the index of regular language L
Is the smallest number of states of any DFA that
recognizes L, and each equivalence class of L consists of exactly
those input sequences having runs that reach a corresponding
state of this
minimal state DFA.
22C:135 – Part I
44
Example 2.3.5.
In Example 2.3.3 it is determined that for L = 0*1*
the L-equivalence classes are
[] = 0*
[1] = 0*11*
[10] = 0*11*0(0+1)*.
According to the proof of Theorem 2.3.3, the corresponding minimal
DFA accepting L is
[ ]
1
0
[10]
[1]
0
1
0,1
22C:135 – Part I
45
Structured Programming Analysis
We define the collection s of regular expressions by:
•  s
the null statement is in s
• s s for all atomic statements s
all atomic statements are in s
• 12 s for each 1,2 s
sequential execution of statement sequences
•  T1 +  F2 s for each 1,2 s and each test
conditional statements: if  then 1 else2
• ( T1)*  F s for each1 s and test 
while loops: while  do1
• ( F1) *  T s for each 1 s and each test
negated while loops: while not  do 1
22C:135 – Part I
46
Theorem 2.4.1: There is a goto-program whose computational
sequences are not described by s
(i.e., not “equivalent” to any structured program).
22C:135 – Part I
47
1
F
1
T

2
3
T
F

4

2
halt
(§) 1(1 2 2)* (1 3 + 1 2)4.
F F
22C:135 – Part I
T
F T
48
Sequential machines
Definition 3.1.1: A deterministic generalized
sequential machine (DGSM) M is a 6-tuple,
M = (S, , , , , s0), where (S, , ) is a deterministic
transition system,  is a finite non-empty set (the
output alphabet), s0S (the start state), and
: S* (the output function).
We also extend  to the closely related function
*: S** operating on sequences of inputs
with the inductive definition (for all sS):
*(s, ) = ,
*(s, x) = *(s, x) ((s, x), ) for all x* and.
We regard M as defining a function, M: * *,
namely M(x) = *(s0, x), and we refer to such
functions as DGSM functions.
22C:135 – Part I
49
Example 3.1.1.
DGSM to delete second 'a' in input
s0
a/a
s1
b/b
a/ 
s2
a/a
b/b
b/b
state
s0
s1
s1
s1
s2
input
a
b
b
a
b
output
a
b
b

b
s2
Sample run
22C:135 – Part I
50
Definition 3.1.2: A function f: * * is called
extendible if f() = , and for each x,y*, there is
z* so that f(xy) = f(x) z.
Definition 3.1.3: Given an extendible function
f: ** and x* we associate a derived function
fx: ** defined for each y* by fx(y) = z if
f(xy) = f(x) z.
Definition 3.1.4: An extendible function f: * * is
said to be of finite index if the set of distinct
derived functions {fx| x*} is finite. The cardinality
of this set is called the index of f and f is of
infinite index if this set is infinite.
22C:135 – Part I
51
Example 3.1.6.
Consider the “unit delay” function f: {0, 1}*  {0, 1}*
defined by f() = , and f(12 … k) = 12 …k-1
for k≥1, and i (1≤i≤k). Clearly f is extendible.
For x,y* and y ≠ , f(0y) = 0f(y) = f(0)f0(y) = f0(y) and so f0(y) = 0f(y).
Also f(x0y) = x0f(y) = f(x0)fx0(y) = xfx0(y) implies that fx0(y) = 0f(y).
Finally since
f(0 ) =  =   = f(0) , f0() = . Similarly fx0() = . Hence we see that fx0 =
f0 for all x*
Similar analysis shows that fx1 = f1 for all x*.
Since f(01) = 0 and f(11) = 1, f0(1) = 0 while
f1(1) =1, and therefore f0 ≠ f1.
Finally note that f(y) = f(y) = f(y), so f = f (and
f(1) = ). Hence there are three distinct derived functions {f, f0, f1}, so
f has index 3.
22C:135 – Part I
52
Theorem 3.1.2: a function f: ** is a DGSM
function if and only if it is extendible and of finite
index.
Example 3.1.8.
From the unit delay function of Example 3.1.6
we obtain the DGSM
f
1/ 
0/ 
0/1
0/0
f0
f1
1/1
1/0
Theorem 3.1.3: if M is a DGSM and L * is regular,
then M(L) * is also regular.
22C:135 – Part I
53
State Minimization
Definition 3.2.1: Let M and N be DGSMs with M = N = .
For integer k>0, state sSM is k-equivalent to state tSN if M(s, x) = N(t,
x) for all x* with len(x) ≤ k.
We write s k t. States s and t are equivalent, written
s  t, provided that s k t for all k>0. A DGSM is called
distinguished provided it has no two distinct states
that are equivalent.
Equivalence and k-equivalence are clearly equivalence relations and
hence partition the state sets into equivalence classes. We denote the
collection of equivalence classes of  and k by the notations
 and k, respectively; for equivalence classes we
write [s] = {t | s  t } and [s]k = {t | s k t }.
22C:135 – Part I
54
Example 3.2.1.
a/1
s
0
a/0
s1
b/1
b/0
a/0
s2
b/1
1 = [s0], [s1, s2]
Definition 3.2.2: Let M and N be DGSMs with M = N = .
Then M and N are equivalent DGSMs provided that for
each state of M there is an equivalent state of N, and
conversely.
22C:135 – Part I
55
Lemma 3.2.1: Let M = (S, , , , , s0) be a DGSM
and s,tS. Then for each k>0
(a) s k+1 t if and only if s 1 t and
(s, ) k (t, ) for all ,
(b) k+1 = k implies n = k for all n > k, and
(c) k+1 = k implies s  t if and only if s k t.
Theorem 3.2.2: Let DGSM M = (S, , , , , s0) have N≥2 states. Then
for s,tS, s  t if and only if s N-1 t.
22C:135 – Part I
56
DGSM state equivalence algorithm
1. Compute 1 — this is accomplished by “inspection”
of the output function for single letter input.
2. Repeat for k = 1, 2, 3, … until no classes split or k = N-1
Given k, compute k+1 as follows: for each pair of
states s and t with s k t, if (s,) k (t,)
for all , then s k+1 t; otherwise split [s]k
into [s]k+1 and [t]k+1.
3. At termination k = .
22C:135 – Part I
57
Example 3.2.3
s
a/0
s
0
b/1
a/1
s
b/1
b/0
s
3
b/0
a/0
s
4
a/1
a/0
s
0
b/1
a/1
s
5
1
b/1
a/0
2
2 = [s0,s4] [s1,s3] [s2,s5]
b/0
b/1
b/1
a/0
s
1 = [s0,s1,s3,s4] [s2,s5]
b/1
a/0
s
a/0
2
b/1
s
1
4
22C:135 – Part I
s
3
b/0
a/0
s
a/1
5
58
Isomorphic DGSMs
Definition 3.2.3: Let M = (SM, , , M, M, s0) and
N = (SN, , , N, N, t0) be DGSMs. Then M is state
isomorphic to N if there is a total function : SM  SN
that is both injective and surjective (i.e., 1-1 and
onto) so that for all sSM and 
(i) (s,) = ((s),), and
(ii) (M(s,)) = N((s), ).
The function  is called a state isomorphism (or
sometimes just an isomorphism).
in M
t

pictorially
in N
22C:135 – Part I
/x
s
s'

/x
t'
59
1
a/0
b/0
B
a/0
b/0
2
a/1
a/1
b/1
A
C
a/1
b/1
a/1
3
a/0
b/0
b/0
4
isomorphic via
22C:135 – Part I
a/0
b/0
b/0
D
s
1
2
3
4
 (s)
D
A
C
B
60
Lemma 3.2.4: Let M = (SM, , , M, M, s0) and
N = (SN, , , N, N, t0) be DGSMs with sSM and tSN.
Then if s  t, M(s, x)  N(t, x) for all x*.
Theorem 3.2.5: Among all DGSMs equivalent to a
given DGSM there is a unique one (up to isomorphism)
which has the fewest states and is distinguished —
this is called the reduced machine.
22C:135 – Part I
61
Example 3.2.2 (again)
s
a/0
s
0
b/1
a/1
s
b/1
a/0
2
b/0
b/1
b/1
a/0
s
1
s
3
b/0
a/0
s
4
a/1
a/1
s0s4
s2s5
b/0
a/0
reduced machine
b/1
s1s3
22C:135 – Part I
5
a/0,b/1
62
Multi-tape Automata
Definition 3.4.1: a non deterministic n-tape
acceptor (n-NFA) A is a (n+5)-tuple,
A = (S, , , s0, S1, … , Sm, R) where S is a finite
non-empty set (the states),  is a finite non-empty
set (the alphabet), s0S, Si,R  S (states that read
from tape i (1≤i≤n), and final or recognition states,
respectively), : S({})  p(S) (the next-state
function); the subsets Si are required to partition
the state set S, that is, S = S1  S2  …  Sm, and
Si  Sj =  if i≠j.
If for every state s and letter , (s,) has exactly
one element and (s,) is empty, then A is
deterministic (n-DFA).
22C:135 – Part I
63
Definition 3.4.2: if A = (S, , s0, , S1, … , Sn, R) is an
n-NFA, an instantaneous description (ID) for A is an
(n+1)-tuple <s, w1, … , wn> where sS is the current
state, and wi* (1≤i≤n) is the remaining content of
the ith input tape.
Definition 3.4.3: IDs =<s, w1, … , wn> and
=<t, x1, … , xn> of a n-NFA A = (S, , s0, , S1, … , Sn, R) determine a runstep, written  | , provided that
t(s,), sSi, wi =  xi, and wj= xj for j≠i; a run of
length k≥0 from ID  to ID , written  |* , is
a sequence of IDs, 0, … , k (k≥0) so that =0, =k,
and i | i+1 for 0≤i<k.
Definition 3.4.4: the relation recognized (or accepted)
by n-NFA A = (S, , s0, , S1, … , Sn, R) is
L(A) = {<w1, … , wn> | wi* (1≤i≤n),
<s0, w1, … , wn> |* <s, , … , >,
and sR}.
22C:135 – Part I
64
Example 3.4.1.
a
1
2
b
a
b
2-DFA recognizing the
“diagonal language”
b
3
4
a
a,b
Figure 3.4.2

1
a
22C:135 – Part I
2
a
2-NFA accepting
a* a*
65
Theorem 3.4.1: for each n>1, there is a relation
accepted by an n-NFA that is accepted by no n-DFA.
Theorem 3.4.2: for each DGSM M there exists a 2-DFA
A with L(A) = {<x, M(x)>| x*}.
Theorem 3.4.3 (projection theorem):
{x1*|  x2, … , xn* so <x1, x2, … , xn>L(A)} is
regular for each n-tape acceptor A.
Corollary 3.4.4: algorithms exist to determine if the
relation accepted by an n-NFA is empty, finite or
infinite.
22C:135 – Part I
66
Theorem 3.4.5 (2-NFA pumping lemma): Let L * *
be a 2-NFA relation. Then there exists an integer N
(depending on L) with the property that for each
<z1, z2>L with len(z1)+len(z2) ≥ N, there exist u1,u2,v1,v2,w1,w2* so that
i) zi = uiviwi, i = 1, 2
ii) len(u1v1)+len(u2v2) ≤ N, and len(v1)+len(v2) ≥ 1, and
iii) <u1vkw1, u2vkw2>L for each k= 0, 1, 2, …
1
2
Theorem 3.4.6: Let L1 and L2 be two relations accepted
by n-NFAs. Then L1  L2 is accepted by an n-NFA.
Theorem 3.4.7: For n>1 there are relations accepted
by n-NFAs (in fact n-DFAs) whose complement and intersection are
not accepted by an n-NFA.
22C:135 – Part I
67
Descargar

Characters and Strings