Programming Languages
2nd edition
Tucker and Noonan
Chapter 7
Semantics
Surely all this is not without meaning.
Ishmael, Moby Dick by Herman Melville
Copyright © 2006 The McGraw-Hill Companies, Inc.
Contents
7.1
7.2
7.3
7.4
7.5
Motivation
Expression Semantics
Program State
Assignment Semantics
Control Flow Semantics
Copyright © 2006 The McGraw-Hill Companies, Inc.
Semantics
Semantics is a precise definition of the meaning of a
program that is syntactically correct and valid
according to the type rules.
Semantics describes run-time behavior
A programming language is complete only when its
syntax, type system, and semantics are well-defined.
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.1 Motivation (for Semantics)
To provide an authoritative definition of the meaning
of all language constructs for:
1. Programmers
2. Compiler writers
3. Standards developers
A program written in a given language must always
produce the same results when given the same
data, regardless of the compiler or computer.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Ways to Define Meaning:
• Operational Semantics: What happens when a
program is run on a machine – how statements
change the machine’s state.
– Early languages were designed for a specific computer;
e.g, Fortran on IBM 709; today, semantics are general
• Axiomatic semantics – Chapter 18
– Abstract, based on formal logic
• Denotational semantics: Statements as program
state-transforming functions – (Chapter 8)
– Rigorous, mathematical, defines functions that specify
exactly how state is affected by each language object.
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.2 Expression Semantics – how to
interpret the meaning of an expression
Copyright © 2006 The McGraw-Hill Companies, Inc.
Semantic Issues
Notation (infix, Polish, etc.)
Precedence/associativity rules (if not completely
specified in the grammar)
Special issues
short-circuit evaluation
representational issues
subexpression evaluation and side effects
Copyright © 2006 The McGraw-Hill Companies, Inc.
Expression Notation – Infix
• Infix : a binary operator is between its two operands.
• What is the meaning of a + b – c * d
– Is it (((a + b)- c)* d)?
– Or (a + ( b – c) * d)? Or …?
• Without knowledge of the language, the expression is
ambiguous
• Languages that use infix may also use associativity
and precedence rules to disambiguate.
– Parentheses can be used to override normal evaluation
order to achieve the desired meaning.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Expression Notation
Four ways to write the expression for the
tree on the earlier slide
•
•
•
•
Infix: (a + b) - (c * d)
Polish Prefix: - + a b * c d
Polish Postfix: a b + c d * Cambridge Polish: (- (+ a b) (* c d))
Identify the subexpressions.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Expression Notation – Polish
• Completely specifying evaluation order leads to
large grammars; one alternative: use an
unambiguous notation
– Polish prefix: a binary operator precedes its operands
Polish Prefix: - + a b * c d
– Polish postfix: a binary operator follows its operands
Polish Postfix: a b + c d * -
• Preorder and postorder traversals of the parse tree
generate the Polish notations.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Pre-order: visit the root, visit the left subtree, visit the right subtree
Post-order: visit the left subtree, visit the right subtree, visit the root
Copyright © 2006 The McGraw-Hill Companies, Inc.
Expression Notation - Polish
• If all operators are binary, this notation eliminates
the need for parentheses, precedence rules, or
associativity rules
– a*b*c : **abc (left associativity) or *a*bc (right)
– a*b*c: abc** or ab*c* (which is left associativity?)
• Flaw: what about unary operators?
- a + b * c (+-a*bc won’t work)
One solution: Use a different symbol (e.g. ‘~’) for
the unary operator
Copyright © 2006 The McGraw-Hill Companies, Inc.
Expression Notation – Cambride prefix
• Polish prefix and postfix fail if operators aren’t binary
- a + b * c (+-a*bc won’t work)
• Solution 1: use different symbol for unary operators
• Solution 2: Cambridge prefix - operators can be n-ary:
Rewrite a + b – c * d as (-(+ a b)(* c d))
Rewrite a + b + c + d as (+ a b c d)
Rewrite -a + b * c as (+(-a)(*b c)
Lisp and Scheme use Cambridge prefix.
Operators precede operands, fully parenthesized.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Associativity of Operators – Table 7.1
Language
C-like
Ada
Fortran
+-*/
L
L
L
Unary - **
R
non
non
R
R
== != < ...
L
non
L
In some languages the meaning of a < b < c is not
the same as the meaning of(a < b) && (b < c)
Argues for non-associativity of logical operators
Copyright © 2006 The McGraw-Hill Companies, Inc.
Precedence of Operators – Table 7.2
Operators
Unary **
*/
+== !=
< <= ...
not (!)
log. and
log. or
C-like
7
6
5
4
3
7
2
1
Ada
3
5
4
3
2
2
2
1
1
Copyright © 2006 The McGraw-Hill Companies, Inc.
Fortran
3
5
4
3
2
2
2
1
1
Other Languages
Smalltalk: left-to-right evaluation, uniform
precedence; left associative
APL: right-to-left evaluation, uniform precedence,
right associative
APL example:
The following expression sorts a word list stored in
matrix X according to word length:
X[⍋X+.≠' ';]
Copyright © 2006 The McGraw-Hill Companies, Inc.
Summary
Grammars can capture some semantic features,
notably operator precedence and associativity
To simplify the grammar, languages with many
operators often write an ambiguous grammar and
use tables such as those in the previous examples
during translation.
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.2.3 Short Circuit Evaluation
Short-circuit evaluation starts evaluating from the left
and stops when the expression value is known
So a && b is evaluated as:
if a then b else false
Or
if (!a) false else B
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.2.3 Short Circuit Evaluation
Likewise
a || b is evaluated as:
if a then true else b
One reason for short-circuit is slightly more efficient.
Another reason: allows for situations where the second
operand might be undefined.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Example
Node p = head;
while (p != null && p.info != key)
p = p.next;
if (p == null) // not in list
...
else
// found it
// without short-circuiting, what happens if p is
null?
Copyright © 2006 The McGraw-Hill Companies, Inc.
Without Short-circuit Evaluation:
boolean found = false;
while (p != null && !found) {
if (p.info == key)
found = true;
else
p = p.next;
}
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.2.4: Expression Meaning
If an expression has been statically type checked its
meaning should depend only on sub-expression
values and operator meaning.
But …
• Number representation (finite-ness)
• Side effects
Copyright © 2006 The McGraw-Hill Companies, Inc.
Number Representation
In computers, unlike the real world, numbers are
finite: there is a largest and a smallest.
Associativity does not always hold.
Is (a + b) + c = a + (b + c) ?
What if a = IntMax, b = 3, c = -5 ?
See example for floats in a previous lecture.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Side Effects – of Functions or Expressions
“A function or expression is said to have a side effect
if, in addition to producing a value, it also modifies
some state or has an observable interaction with
calling functions or the outside world.
For example, a function might modify a global or a
static variable, modify one of its arguments, raise
an exception, write data to a display or file, read
data, or call other side-effecting functions.”
http://en.wikipedia.org/wiki/Side_effect_(computer_science)
Copyright © 2006 The McGraw-Hill Companies, Inc.
Side Effects – of Functions or Expressions
• Side effects are not necessarily bad
– They are how void functions accomplish their job
• However, sometimes they have unexpected
consequences, particularly if there are language
issues that are semantically undefined.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Side Effects & SubExpression Evaluation
What is the value of
when
a = b * i++ + c * i;
i = 2; b = 2; c = 5;
Order of evaluation for subexpressions is not defined
in most languages: (b * i++)may be evaluated
before or after (c * i)
In C-like languages, a is semantically undefined.
Or what about the expression
y + f(x, y);
if f modifies the value of y
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.3 Program State
“The state of a program is the binding of all active
[alive] objects to their current values.” – page 160
State-related maps:
1. The pairing of active objects with specific
memory locations (environment)
2. The pairing of active memory locations with their
current values (memory map)
Copyright © 2006 The McGraw-Hill Companies, Inc.
Example
Given a program with two variables, i and j, currently
associated with memory addresses 154 and 155 and
having values 13 and -1:
environment = { <i, 154>, <j, 155>}
memory = {<0, undef>, …, <154, 13>, <155, -1>, …}
program state = memory X environment
= {< i, 13>, <j, -1>}
Copyright © 2006 The McGraw-Hill Companies, Inc.
State Transitions
The individual steps that occur during a program run
can be viewed as a series of state transformations.
For example, in the previous case, executing
statement i = i + 2 * j produces this state:
11>, <j, -1>}
{< i,
A program trace (similar to what could be provided by
a debugger) shows changes in program state.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Figure 7.2 Factorial Program
// compute the factorial of n
1 void main ( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while (i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
Copyright © 2006 The McGraw-Hill Companies, Inc.
// compute the
factorial of n
1 void main ( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while(i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
n
i
f
undef? undef undef
3
undef undef
Copyright © 2006 The McGraw-Hill Companies, Inc.
// compute the
factorial of n
1 void main( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while(i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
n
i
3
3
undef undef
1
undef
Copyright © 2006 The McGraw-Hill Companies, Inc.
f
// compute the
factorial of n
1 void main( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while(i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
n
i
3
3
3
undef undef
1
undef
1
1
Copyright © 2006 The McGraw-Hill Companies, Inc.
f
// compute the
factorial of n
1 void main( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while(i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
n
i
f
3
3
3
3
undef
1
1
1
undef
undef
1
1 (no change)
Copyright © 2006 The McGraw-Hill Companies, Inc.
// compute the
factorial of n
1 void main( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while(i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
n
i
f
3
3
1
2
1
1
Copyright © 2006 The McGraw-Hill Companies, Inc.
// compute the
factorial of n
1 void main( ) {
2 int n, i, f;
3 n = 3;
4 i = 1;
5 f = 1;
6 while(i < n) {
7
i = i + 1;
8
f = f * i;
9
}
10 }
n
i
f
3
2
3
Copyright © 2006 The McGraw-Hill Companies, Inc.
Discussion
(In C/C++ undefined variables are initialized
according to their types but not in every
language.)
Loop test – no side effects
Assignment statements – have side effects
The program state is altered.
Order of sub-expression evaluation will be an
issue if the sub-expressions have side
effects.
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.4 Assignment Semantics
Abstract syntax for Clite assigment statement
Assignment = Variable target: Expression source
Semantic interpretation:
• Evaluate the source expression, using current state
• Get value
• Replace value of target variable with value, getting
a new state.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Meaning Rules (Chapter 8)
Meaning rules map states to states
Meaning Rule 8.3 (page 202) The meaning of an
assignment statement is the result of replacing the
value of the target Variable by the value of the
source Expression in the current state.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Denotational Semantics (Chapter 8)
Assignment = Variable target: Expression source
M: Assignment X State → State
M(Assignment a, State state )
= state ‫<{ ט‬a.target, M(a.source, state)>}
Loose interpretation: the meaning of an assignment a
given the state state is the state in which the only
change is that the target variable of a has a new
value which is defined as the meaning of the source
expression.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Assignment Statement Issues
Issues
• Multiple assignment
• Assignment statement vs. expression
• Copy vs. reference semantics
Copyright © 2006 The McGraw-Hill Companies, Inc.
Multiple Assignment
Example:
a = b = c = 0;
Sets all 3 variables to zero.
a = b = c;
Associativity???
Copyright © 2006 The McGraw-Hill Companies, Inc.
Assignment Statement vs. A. Expression
In most languages, assignment is a statement; cannot
appear in an expression.
In most C-like languages, assignment is an expression.
• while (*p++ = *q++) ; // strcpy
• while (ch = getc(fptr)) ... // ???
• while (p = p->next) ... // ???
Expression value: the value assigned to the target var.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Ass’t Statement vs. Ass’t. Expression
if (a = 0) ... // an error ???
The meaning (semantics) of an assignment
statement is not the same as the meaning of
an assignment expression.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Copy vs. Reference Semantics
• Copy: a = b;
– a, b have same value.
– Changes to either have no effect on other.
– Used in imperative languages.
• Reference
– a, b point to the same object.
– A change in object state affects both
– Used by many object-oriented languages.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Turing Complete Languages
Textbook definition: A programming language is
Turing complete if its programs are capable of
computing any computable functions.
Or… if it can compute the same functions that a
Turing machine can compute
Turing machine = a simple, theoretical model of a
computer; can recognize all languages described by
any grammar (including the unrestricted grammars)
Importance: in theoretical CS – decidability issues
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.5 Control Flow Semantics
To be Turing complete, an imperative language needs:
• Statement sequencing
• Conditional statement
• Looping statement
Bohm-Jacopini Theorem
Copyright © 2006 The McGraw-Hill Companies, Inc.
Sequence
s1 s2
Semantics: in the absence of a branching statement,
• First execute s1
• Then execute s2
• Output state of s1 becomes the input state of s2
Branching statements: return, break, continue, goto,
…
Copyright © 2006 The McGraw-Hill Companies, Inc.
Conditional
IfStatement  if ( Expresion ) Statement
[ else Statement ]
Example:
Meaning:
if (a > b)
z = a;
else
z = b;
If the test expression is true,
then the output state of the
conditional is the output state
of the then branch,
else the output state of the
conditional is the output state
of the else branch.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Conditional
Switch (Case) Statement
Originated in Fortran as a computed goto:
goto (100, 200, 300, 400), i
Some issues for a switch statement:
• Data type of switch variable – some integral type
• Whether or not to require a break after each case
– Use of implicit break (Ada) is less error prone
• Whether or not to allow a range of values
Copyright © 2006 The McGraw-Hill Companies, Inc.
Loops
WhileStatement  while ( Expression ) Statement
The expression is evaluated.
If it is true, first the statement is executed,
and then the loop statement is executed again.
Otherwise the loop terminates.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Loop Variations
• Begin-test loop (e.g. while)
• End-test loop (e.g. do_while)
• Counting loop (e.g. for)
• Iterator loops (e.g., for-each)
– An iterator is any finite set of values over which a loop can
be repeated
– foreach $number (@ number) {
$number = $number + 1;
}
Copyright © 2006 The McGraw-Hill Companies, Inc.
Loop Variations
• foreach loops exist in several languages (e.g., Ada, C#)
• Simple form of a for loop in which there is no explicit
LCV, no requirement to specify increment
• The assumption is that every item in the collection will be
processed in order.
– Another Perl example
– foreach $key (sort keys %list) {
print $key, “\t”, $list{$key},“\n”;
}
Copyright © 2006 The McGraw-Hill Companies, Inc.
Branching Statements & Loop Control
return: exit from a function
break:
non-standard exit from a loop
continue:
non-standard exit from current loop
iteration
goto:
transfer of control to any named program
statement
Structured Programming: In part, advocates program
design based on sequence, conditional, while loops
only – no goto.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Spaghetti code?
j=0
ct = 0
sum = 0.0
100 if (j .ge. n) goto 300
j=j+1
if (a(j) .lt. 0) goto 100
ct = ct + 1
sum = sum + a(j)
goto 100
300 continue
Copyright © 2006 The McGraw-Hill Companies, Inc.
Summary
Semantics = program meaning
i.e., how the program’s state changes as it executes
This chapter presents an informal discussion
expressions, assignments, conditionals, branching,
looping statements (skipped I/O and exceptions)
Chapter 8 shows how the semantics of a language can
be defined using a set of meaning functions; a
formal statement of the semantics of various
program constructs.
Copyright © 2006 The McGraw-Hill Companies, Inc.
Examples of Meaning Functions for Clite
The meaning of a Program is … the meaning of its
body when given an initial state consisting of the
variables of the decpart, each initialized to the
undef value corresponding to its type.
The meaning of an assignment statement is the result
of replacing the value of the target Variable by
the value of the source Expression in the current
state.
Copyright © 2006 The McGraw-Hill Companies, Inc.
7.6 Input/Output Semantics
• Background Concepts
– open, close
• open binds a file to a file id, sets up tables, etc.
– Access: sequential vs. random
• Sequential access reflects order of data in file;
random access allows elements to be accessed out of
order, multiple times
– Stream vs. fixed length records
• Data is viewed as either a stream of characters or a
series of fixed length records
Copyright © 2006 The McGraw-Hill Companies, Inc.
Standard Files
• Used to represent keyboard, screen display,
and a standard error file to store run-time
errors
– Unix: stdin, stdout, stderr
– C: stdin, stdout, stderr
– C++: cin, cout, cerr
– Java: System.in, System.out, System.err
Copyright © 2006 The McGraw-Hill Companies, Inc.
Unformatted vs Formatted I/O
Unformatted I/O corresponds to what C++ and Java
refer to as streams, and Fortran calls list-directed.
Characters are read or written as a stream; input
routines group them into tokens and convert to an
internal data type based on the next list item
C++ Example: cin >> x >> y >> p;
Fortran Example: READ(*,*) x, y, p
Copyright © 2006 The McGraw-Hill Companies, Inc.
Unformatted Input/Output Streams
• Fortran code to read data into an array using list
integer :: i, a(8)
write(*,*) “Enter 8 integers: “
read(*,*) a
write(*,*) a
• Java stream types for sequential I/O:
– file: transfers data to or from a file
– pipe: transfers data to another stream
– memory: transfer data between an array and the program
Copyright © 2006 The McGraw-Hill Companies, Inc.
Formatted I/O
Requires the programmer to write descriptions of the
input/output format
C: fscanf(input, “%d”, &a[i]);
hybrid; permits arbitrary number of blanks
Fortran syntax for formatted reads and writes:
READ (file-unit,format-expr)var-list)
WRITE (file-unit,format-expr)var-list)
Fortran doesn’t automatically skip blanks
Copyright © 2006 The McGraw-Hill Companies, Inc.
Formatted I/O
WRITE (*, 120) a
120 FORMAT ( ‘Contents of array a: ‘, 8I5)
Position 1 in an output format statement controls line feeds
Blank indicates single space, ‘1’ (?) double space, anything
else is a form feed
Copyright © 2006 The McGraw-Hill Companies, Inc.
Formats
• C
– Codes: d, e, f, c, s (decimal. float, float, char, string)
– Specifier: % opt-width code
– Ex: %s %5d %20s %8.2f
• Fortran
– Codes: i, f, a (integer, float, string)
– Specifier: op-repeat code width
– Ex: 8i4, f8.2, a20
Copyright © 2006 The McGraw-Hill Companies, Inc.
Descargar

Programming Languages - UAH