
Chapter 1 - Principles of Programming Languages


Chapter 1

Preliminaries

Chapter 1 Topics
Reasons for Studying Concepts of Programming Languages
Programming Domains
Language Evaluation Criteria
Influences on Language Design
Language Categories
Language Design Trade-Offs
Implementation Methods
Programming Environments
Chapter 1
Preliminaries

Reasons for Studying Concepts of Programming Languages


Increased ability to express ideas.
It is believed that the depth at which we think is influenced by the expressive
power of the language in which we communicate our thoughts.
It is difficult for people to conceptualize structures they can't describe, verbally
or in writing.
The language in which they develop software places limits on the kinds of control
structures, data structures, and abstractions they can use.
Awareness of a wider variety of programming language features can reduce
such limitations in software development.
Can language constructs be simulated in languages that do not support them
directly? For example, associative arrays are built into Perl but must be simulated
in C (see the sketch below).
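As a rough illustration (not from the text), the following minimal C-style sketch simulates a small associative array using parallel arrays and a linear search; the names AssocArray, lookup, and insert are invented for illustration, and Perl's hashes provide this behavior natively:

#include <cstring>
#include <cstdio>

// A tiny fixed-capacity "associative array": parallel arrays of keys
// and values, searched linearly. Perl's hashes do this (and more) natively.
struct AssocArray {
    const char* keys[16];
    int values[16];
    int count;
};

// Return the value stored under key, or -1 if the key is absent.
int lookup(const AssocArray* a, const char* key) {
    for (int i = 0; i < a->count; i++)
        if (strcmp(a->keys[i], key) == 0)
            return a->values[i];
    return -1;
}

// Insert or update a key/value pair (capacity checks omitted for brevity).
void insert(AssocArray* a, const char* key, int value) {
    for (int i = 0; i < a->count; i++)
        if (strcmp(a->keys[i], key) == 0) { a->values[i] = value; return; }
    a->keys[a->count] = key;
    a->values[a->count] = value;
    a->count++;
}

int main() {
    AssocArray ages = { {}, {}, 0 };
    insert(&ages, "alice", 30);            // in Perl: $ages{"alice"} = 30;
    printf("%d\n", lookup(&ages, "alice"));
    return 0;
}

A realistic simulation would also handle capacity growth and hashing; the point is only that the programmer must build by hand what Perl offers as a language construct.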

Improved background for choosing appropriate languages


Many programmers, when given a choice of languages for a new project, continue
to use the language with which they are most familiar, even if it is poorly suited to
the new project.
If these programmers were familiar with other languages available, they would be
in a better position to make informed language choices.

Greater ability to learn new languages


Programming languages are still in a state of continuous evolution, which means
continuous learning is essential.
Programmers who understand the concepts of object-oriented programming will
have an easier time learning Java.
Once a thorough understanding of the fundamental concepts of languages is
acquired, it becomes easier to see how concepts are incorporated into the design
of the language being learned.

Understand significance of implementation


Understanding of implementation issues leads to an understanding of why
languages are designed the way they are.
This in turn leads to the ability to use a language more intelligently, as it was
designed to be used.
For example, programmers who know little about how recursion is implemented
often do not know that a recursive algorithm can be far slower than an equivalent
iterative algorithm (see the sketch below).
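As a concrete example (illustrative, not from the text), a naive recursive Fibonacci function repeats an exponential amount of work and pays for a stack frame per call, while the equivalent iteration runs in linear time:

#include <cstdio>

// Naive recursion: recomputes the same subproblems, O(2^n) calls,
// plus the cost of one stack frame per call.
long fib_recursive(int n) {
    if (n < 2) return n;
    return fib_recursive(n - 1) + fib_recursive(n - 2);
}

// Equivalent iteration: O(n) time, constant space.
long fib_iterative(int n) {
    long a = 0, b = 1;
    for (int i = 0; i < n; i++) {
        long next = a + b;
        a = b;
        b = next;
    }
    return a;
}

int main() {
    printf("%ld %ld\n", fib_recursive(30), fib_iterative(30));
    return 0;
}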
Ability to design new languages
The more languages you know, the better your understanding of programming
language concepts.

Overall advancement of computing


In some cases, a language became widely used, at least in part, because those in
positions to choose languages were not sufficiently familiar with programming
language concepts.
Many believe that ALGOL 60 was a better language than Fortran; however,
Fortran was the more widely used. This is attributed in part to the fact that the
programmers and managers didn't understand the conceptual design of ALGOL 60.
Programming Domains

Scientific applications
In the early 1940s, computers were invented for scientific applications.
These applications require large numbers of floating-point computations.
Fortran was the first language developed for scientific applications.
ALGOL 60 was intended for the same use.
Business applications
The first successful language for business was COBOL.
Business languages are characterized by facilities for producing elaborate
reports, precise ways of describing and storing decimal numbers and character
data, and the ability to specify decimal arithmetic operations.
The arrival of PCs started new ways for businesses to use computers.
Spreadsheets and database systems were developed for business.
Artificial intelligence
Symbolic, rather than numeric, computations are performed.
Symbolic computation is more suitably done with linked lists than arrays.
LISP was the first widely used AI programming language.
An alternative approach to AI applications: Prolog
Scheme, a dialect of LISP
Systems programming
The OS and all of the programming support tools are collectively known as its
system software.
Need efficiency because of continuous use.
A language for this domain must provide fast execution. Furthermore, it must
have low-level features that allow the software interfaces to external devices to
be written.
The UNIX operating system is written almost entirely in C.
Scripting languages
Put a list of commands, called a script, in a file to be executed.
The language, named sh (for shell), began as a small collection of commands that
were interpreted as calls to system subprograms that performed utility functions,
such as file management and simple file filtering.
awk, another scripting language, began as a report-generation language but later
became a more general-purpose language.
The Perl language, developed by Larry Wall, was originally a combination of sh
and awk.
The use of Perl rose dramatically, primarily because it is a nearly ideal language
for Common Gateway Interface (CGI) programming.
JavaScript (Flanagan, 1998) is a scripting language developed by Netscape.
JavaScript is used mostly as a client-side scripting language.
JavaScript is embedded in HTML documents and is interpreted by a browser
that finds the code in a document that is being displayed.
PHP is a scripting language used on Web server systems. Its code is embedded
in HTML documents. The code is interpreted on the server before the document
is sent to a requesting browser.
Special-purpose languages
A host of special-purpose languages have appeared over the past 40 years.
They range from RPG, which is used to produce business reports, to APT, which
is used for instructing programmable machine tools, to GPSS, which is used for
system simulation.
This book does not discuss special-purpose languages.
Language Evaluation Criteria
Readability
The most important criterion for judging a programming language is the ease with which
programs can be read and understood.
Early language constructs were designed more from the point of view of the computer
than of the user.
Because ease of maintenance is determined in large part by the readability of programs,
readability became an important measure of the quality of programs and programming
languages. The result is a crossover from focus on machine orientation to focus on human
orientation.
Overall simplicity: strongly affects readability
Too many features make the language difficult to learn. Programmers tend to learn a
subset of the language and ignore its other features.
Multiplicity of features is also a complicating characteristic: having more than one
way to accomplish a particular operation. Ex (Java):
count = count + 1
count += 1
count++
++count
Although the last two statements have slightly different meanings from each other and
from the others, all four have the same meaning when used as stand-alone
expressions.
Operator overloading: a single operator symbol has more than one meaning.
Although this is a useful feature, it can lead to reduced readability if users are allowed
to create their own overloadings and do not do so sensibly (see the sketch below).
Most assembly language statements are models of simplicity.
This very simplicity, however, makes assembly language programs less readable,
because they lack the more complex control statements.
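A brief hypothetical C++ sketch of the overloading point: the first overload of + is sensible because it preserves the operator's arithmetic meaning; the second is legal but misleading (the types Vec2 and Code are invented for illustration):

// A sensible overload: + on a 2-D vector behaves like arithmetic addition.
struct Vec2 {
    double x, y;
};
Vec2 operator+(Vec2 a, Vec2 b) { return { a.x + b.x, a.y + b.y }; }

// A confusing overload: here + silently means "append a digit", so
// readers can no longer trust the symbol's usual meaning.
struct Code {
    int digits;
};
Code operator+(Code a, Code b) { return { a.digits * 10 + b.digits }; }

int main() {
    Vec2 v = Vec2{1, 2} + Vec2{3, 4};   // reads like arithmetic, is arithmetic
    Code c = Code{1} + Code{2};         // reads like arithmetic, but isn't
    return (v.x == 4.0 && c.digits == 12) ? 0 : 1;
}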

Orthogonality
Makes the language easy to learn and read.
A relatively small set of primitive constructs can be combined in a relatively small
number of ways to build the control and data structures of the language.
Every possible combination is legal and meaningful.
The more orthogonal the design of a language, the fewer exceptions the language
rules require.
Example: In C, parameters are passed by value, unless they are arrays, in
which case they are, in effect, passed by reference (because the appearance of an
array name without a subscript in a C program is interpreted as the address of the
array's first element); see the sketch below.
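The following small sketch (illustrative, not from the text; it compiles as C or C++) shows the consequence: assigning to a scalar parameter has no effect on the caller, while assigning to an array element does:

#include <cstdio>

// The scalar parameter is copied (pass by value) ...
void bump_scalar(int n) { n = n + 1; }   // the caller's copy is unchanged

// ... but an array name decays to a pointer to its first element, so the
// function modifies the caller's array (pass by reference, in effect).
void bump_array(int a[]) { a[0] = a[0] + 1; }

int main() {
    int n = 1;
    int a[1] = { 1 };
    bump_scalar(n);
    bump_array(a);
    printf("%d %d\n", n, a[0]);   // prints "1 2"
    return 0;
}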
Example: Adding two 32-bit integer values that reside in either memory or registers
and replacing one of the two values with the sum.
The IBM mainframes have two instructions:
A Reg1, memory_cell
//Reg1 <- contents (Reg1) + contents(memory_cell)
AR Reg1, Reg2
//Reg1 <- contents (Reg1) + contents(Reg2)
where Reg1 and Reg2 represent registers.

The VAX addition instruction for 32-bit integer values is:


ADDL operand_1, operand_2
//operand_2 <- contents(operand_1) + contents(operand_2)
In this case, either operand can be a register or a memory cell.
The VAX instruction design is orthogonal in that a single instruction can use
either registers or memory cells as the operands. The IBM design is not
orthogonal.
Too much orthogonality can also cause problems.
Example: the most orthogonal programming language is ALGOL 68. Every
language construct in ALGOL 68 has a type, and there are no restrictions on those
types.
For example, a conditional can appear as the left side of an assignment, along
with declarations and other assorted statements, as long as the result is an
address.
This form of orthogonality leads to unnecessary complexity.

Control Statements
It became widely recognized that indiscriminate use of goto statements severely
reduced program readability.
Example: Consider the following nested loops written in C
while (incr < 20)
{
  while (sum <= 100)
  {
    sum += incr;
  }
  incr++;
}

If C didn't have a loop construct, this would be written as follows:

loop1:
  if (incr >= 20) goto out;
loop2:
  if (sum > 100) goto next;
  sum += incr;
  goto loop2;
next:
  incr++;
  goto loop1;
out:

The versions of BASIC and Fortran in the early 1970s lacked the control statements
that allow strong restrictions on the use of gotos, so writing highly readable programs
in those languages was difficult.
Since then, languages have included sufficient control structures. The control
statement design of a language is now a less important factor in readability than it
was in the past.

Data Types and Structures


The presence of adequate facilities for defining data types and data structures in a
language is another significant aid to readability.
Example: suppose a numeric type is used for an indicator flag because there is no
Boolean type in the language. In such a language, we might have an assignment such
as

timeout = 1

whose meaning is unclear, whereas in a language that includes a Boolean type
we would have

timeout = true

Syntax Considerations
The syntax of the elements of a language has a significant effect on readability.
The following are examples of syntactic design choices that affect readability:
Identifier forms: Restricting identifiers to very short lengths detracts from
readability.
Example: In Fortan 77, identifiers can have six characters at most.
Example: ANSI BASIC (1978) an identifier could consist only of a
single letter or a single letter followed by a single digit.
Special Words: Program appearance, and thus program readability, is strongly
influenced by the forms of a language's special words (e.g., while, class, for).
Example: C uses braces for pairing control structures; it can be difficult to
determine which group is being ended.
Example: Fortran 95 allows special words to be used as legal variable
names, which can be confusing. Ada's distinct closing words are more
readable: it uses end if to terminate a selection construct, and end loop to
terminate a loop construct.
Form and Meaning: Designing statements so that their appearance at least
partially indicates their purpose is an obvious aid to readability.
Semantics should follow directly from syntax, or form.
Example: In C, the meaning of static depends on the context of its
appearance.
If used on the definition of a variable inside a function, it means the variable
is created at compile time.
If used on the definition of a variable that is outside all functions, it
means the variable is visible only in the file in which its definition
appears; it is not exported from that file (see the sketch below).
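A minimal sketch of both uses (illustrative; the names hits and next_id are invented):

#include <cstdio>

// File-scope static: 'hits' is visible only within this source file;
// it is not exported to the linker.
static int hits = 0;

int next_id(void) {
    // Function-scope static: 'id' has static storage duration, so it is
    // allocated once, before execution begins, and persists across calls.
    static int id = 0;
    hits++;
    return ++id;
}

int main() {
    next_id();
    int second = next_id();
    printf("%d %d\n", second, hits);   // prints "2 2"
    return 0;
}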
Writability
It is a measure of how easily a language can be used to create programs for a chosen
problem domain.
Most of the language characteristics that affect readability also affect writability.
Simplicity and orthogonality
A smaller number of primitive constructs and a consistent set of rules for
combining them (that is, orthogonality) is much better than simply having a large
number of primitives.
A programmer can design a solution to a complex problem after learning only a
simple set of primitive constructs.
Support for abstraction
Abstraction means the ability to define and then use complicated structures or
operations in ways that allow many of the details to be ignored.
Programming languages can support two distinct categories of abstraction,
process and data.
A simple example of process abstraction is the use of a subprogram to implement
a sort algorithm that is needed several times in a program. Without the
subprogram, the sort code would have to be replicated in all places where it was
needed.
As an example of data abstraction, consider a binary tree that stores integer
data in its nodes. In Fortran 77, such a tree must be represented by three parallel
integer arrays, where two of these arrays hold subscripts that specify offspring
nodes. In C++ and Java, these trees can be implemented by using an abstraction of
a tree node in the form of a simple class with two pointers (or references) and an
integer (see the sketch below).
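A minimal sketch of the two representations (illustrative; array sizes and names are invented):

// Fortran 77 style: three parallel arrays; a subscript of 0 can mean "no child".
int value[100];
int left_child[100];    // left_child[i] is the subscript of node i's left child
int right_child[100];   // right_child[i] is the subscript of node i's right child

// C++ style: the node abstraction bundles the data and the two links.
class TreeNode {
public:
    int value;
    TreeNode* left;
    TreeNode* right;
    TreeNode(int v) : value(v), left(nullptr), right(nullptr) {}
};

int main() {
    TreeNode root(10);
    root.left = new TreeNode(5);    // the representation details stay behind the type
    root.right = new TreeNode(15);
    delete root.left;
    delete root.right;
    return 0;
}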
Expressivity
It means that a language has relatively convenient, rather than cumbersome, ways
of specifying computations.
Ex: ++count is more convenient and shorter than count = count + 1.
Reliability
A program is said to be reliable if it performs to its specifications under all conditions.
Type checking: is simply testing for type errors in a given program, either by the
compiler or during program execution.
The earlier errors are detected, the less expensive it is to make the required
repairs. Java requires type checking of nearly all variables and expressions at
compile time.
Exception handling: the ability to intercept run-time errors, take corrective measures,
and then continue is a great aid to reliability.
Aliasing: having two or more distinct referencing methods, or names, for the same
memory cell. In C, examples include union members and two pointers set to point to
the same variable (see the sketch at the end of this list).
It is now widely accepted that aliasing is a dangerous feature in a language.
Readability and writability: Both readability and writability influence reliability.
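A small sketch of both kinds of aliasing mentioned above (illustrative, not from the text):

#include <cstdio>

// Two ways C/C++ create aliases: union members overlay the same storage,
// and two pointers can refer to the same variable.
union Overlay {
    int as_int;
    float as_float;   // shares its memory with as_int
};

int main() {
    int x = 0;
    int* p = &x;
    int* q = &x;          // p and q are aliases for x
    *p = 5;
    printf("%d\n", *q);   // prints 5: writing through p changed what q sees

    union Overlay o;
    o.as_int = 42;        // also overwrites the bytes of o.as_float
    printf("%d\n", o.as_int);
    return 0;
}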
Cost
Categories
Training programmers to use language
Writing programs
Compiling programs
Executing programs
Language implementation system: the availability of free compilers was key to
the success of Java
Reliability: does the software fail?
Maintaining programs: maintenance costs can be as high as two to four times as
much as development costs.
Portability: eased by standardization of the language
Generality (the applicability to a wide range of applications)
Influences on Language Design
Computer architecture: Von Neumann
We use imperative languages, at least in part, because we use von Neumann machines
Data and programs stored in same memory
Memory is separate from CPU
Instructions and data are piped from memory to CPU
Results of operations in the CPU must be moved back to memory
Basis for imperative languages
Variables model memory cells
Assignment statements model piping
Iteration is efficient

Programming methodologies
Late 1960s: Procedure-oriented
People efficiency became important: readability, better control structures
Structured programming
Top-down design and step-wise refinement
Late 1970s: Procedure-oriented to data-oriented
data abstraction
early 1980s: Object-oriented programming
Language Categories
Imperative
Central features are variables, assignment statements, and iteration
C, Pascal
Functional
Main means of making computations is by applying functions to given parameters
LISP, Scheme
Logic
Rule-based
Rules are specified in no special order
Prolog
Object-oriented
Encapsulate data objects with processing
Inheritance and dynamic type binding
Grew out of imperative languages
C++, Java

Language Design Trade-offs


Reliability vs. cost of execution
Example: C programs execute faster than semantically equivalent Java programs,
although Java programs are more reliable.
Java demands that all references to array elements be checked for proper indexing,
which leads to increased execution costs.
C does not require index range checking
Implementation Methods
The major methods of implementing programming languages are compilation, pure
interpretation, and hybrid implementation
Compilation: Programs are translated into machine language
Pure Interpretation: Programs are interpreted by another program known as an
interpreter
Hybrid Implementation Systems: A compromise between compilers and pure
interpreters
The operating system and language implementation are layered over machine interface
of a computer.
These layers can be thought of as virtual computers, providing interfaces to the user at
higher levels

Figure 1.2 Layered View of Computer


Layered interface of virtual computers, provided by a typical computer system
Compilation
Translate high-level program (source language) into machine code (machine language)
Slow translation, fast execution
C, COBOL, and Ada are typically implemented by compilers.
Compilation process has several phases:
Lexical analysis: converts characters in the source program into lexical units
The lexical units of a program are identifiers, special words, operators, and
punctuation symbols.
Syntax analysis: transforms lexical units into parse trees
These parse trees represent the syntactic structure of the program
Semantic analysis: generates intermediate code
Intermediate languages sometimes look very much like assembly
languages, and in fact sometimes are actual assembly languages.
Symbol table: the type and attribute information of each user-defined name in
the program
Optimization: improve programs by making them smaller or faster or both
Code generation: machine code is generated

Figure 1.3 The Compilation Process


Fetch-execute-cycle (on a von Neumann architecture)

initialize the program counter


repeat forever
fetch the instruction pointed to by the counter
increment the counter
decode the instruction
execute the instruction
end repeat

Von Neumann Bottleneck


The speed of the connection between a computer's memory and its processor
largely determines the speed of the computer.
Instructions often can be executed a lot faster than they can be moved to the
processor for execution.
This connection speed is called the von Neumann bottleneck; it is the primary
limiting factor in the speed of computers
Pure Interpretation
Programs are interpreted by another program called an interpreter, with no translation.
Advantage: easy implementation of many source-level debugging operations, because all
run-time error messages can refer to source-level units.
Disadvantage: slower execution (10 to 100 times slower than compiled programs)
Bottleneck: Statement decoding, rather than the connection between the processor and
memory, is the bottleneck of a pure interpreter.
Significant comeback with some Web scripting languages (e.g., JavaScript and PHP).

Figure 1.4 Pure Interpretation Process


Hybrid Implementation Systems
A compromise between compilers and pure interpreters
A high-level language program is translated to an intermediate language that allows
easy interpretation
Faster than pure interpretation
Examples:
Perl programs are partially compiled to detect errors before interpretation
Initial implementations of Java were hybrid: the intermediate form, byte code,
provides portability to any machine that has a byte code interpreter and a run-time
system (together, these are called the Java Virtual Machine)

Figure 1.5 Hybrid Implementation System


Programming Environments
The collection of tools used in software development
UNIX
An older operating system and tool collection
Borland JBuilder
An integrated development environment for Java
Microsoft Visual Studio.NET
A large, complex visual environment
Used to program in C#, Visual BASIC.NET, Jscript, J#, or C++
Summary
The study of programming languages is valuable for a number of important reasons:
Increases our capacity to use different constructs in writing programs
Enables us to choose languages for projects more intelligently
Makes learning new languages easier
Among the most important criteria for evaluating languages are:
Readability
Writability
Reliability
Overall cost
The major methods of implementing programming languages are
Compilation
Pure interpretation
Hybrid implementation
Chapter 2
Evolution of the Major Programming Languages
Chapter 3
Describing Syntax and Semantics

Chapter 3 Topics
Introduction
The General Problem of Describing Syntax
Formal Methods of Describing Syntax
Describing the Meanings of Programs: Dynamic Semantics
Chapter 3
Describing Syntax and Semantics

Introduction
Syntax - the form of the expressions, statements, and program units
Semantics - the meaning of the expressions, statements, and program units.
Ex:
while (<Boolean_expr>) <statement>
The semantics of this statement form is that when the current value of the Boolean
expression is true, the embedded statement is executed.
The form of a statement should strongly suggest what the statement is meant to
accomplish.

The General Problem of Describing Syntax


A sentence or statement is a string of characters over some alphabet. The syntax rules of
a language specify which strings of characters from the language's alphabet are in the
language.
A language is a set of sentences.
A lexeme is the lowest-level syntactic unit of a language. Lexemes include identifiers,
literals, operators, and special words (e.g., *, sum, begin). A program is a string of lexemes.
A token is a category of lexemes (e.g., identifier). identifier is a token that has
lexemes, or instances, such as sum and total.
Ex:
index = 2 * count + 17;

Lexemes Tokens
index identifier
= equal_sign
2 int_literal
* mult_op
count identifier
+ plus_op
17 int_literal
; semicolon
Language Recognizers and Generators
In general, languages can be formally defined in two distinct ways: by recognition and by
generation.
Language Recognizers:
o A recognition device reads input strings of the language and decides whether the
input strings belong to the language.
o It only determines whether given programs are in the language.
o Example: the syntax analyzer part of a compiler. The syntax analyzer, also known
as the parser, determines whether given programs are syntactically correct. (A
minimal recognizer sketch follows this list.)
Language Generators:
o A device that generates sentences of a language
o One can determine if the syntax of a particular sentence is correct by comparing it to
the structure of the generator
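As a rough sketch of the recognition approach (illustrative, not the book's code), the following C++ function decides whether a string belongs to the tiny language generated by <list> -> id | id , <list>, where id is any single lowercase letter:

#include <cctype>
#include <cstdio>

// Decide whether s is a sentence of the language generated by
//   <list> -> id | id , <list>
// where id is a single lowercase letter. Accepts "a", "a,b", "a,b,c", ...
bool recognize(const char* s) {
    int i = 0;
    if (!islower((unsigned char)s[i])) return false;  // must start with an id
    i++;
    while (s[i] == ',') {           // each comma must be followed by an id
        i++;
        if (!islower((unsigned char)s[i])) return false;
        i++;
    }
    return s[i] == '\0';            // the entire string must be consumed
}

int main() {
    printf("%d %d %d\n", recognize("a,b,c"), recognize("a,"), recognize("x"));
    return 0;   // prints "1 0 1"
}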
Formal Methods of Describing Syntax
The formal language generation mechanisms are usually called grammars
Grammars are commonly used to describe the syntax of programming languages.

Backus-Naur Form and Context-Free Grammars


It is a syntax description formalism that became the most widely used method for
programming language syntax.

Context-free Grammars
Developed by Noam Chomsky in the mid-1950s who described four classes of generative
devices or grammars that define four classes of languages.
Context-free and regular grammars are useful for describing the syntax of programming
languages.
Tokens of programming languages can be described by regular grammars.
Whole programming languages can be described by context-free grammars.

Backus-Naur Form (1959)


Invented by John Backus to describe ALGOL 58 syntax.
BNF (Backus-Naur Form) is equivalent to context-free grammars used for describing
syntax.

Fundamentals
A metalanguage is a language that is used to describe another language. Ex: BNF.
In BNF, abstractions are used to represent classes of syntactic structures--they act like
syntactic variables (also called nonterminal symbols)

<assign> -> <var> = <expression>

This is a rule; it describes the structure of an assignment statement


A rule has a left-hand side (LHS), which is the abstraction being defined, and a
right-hand side (RHS), which consists of some mixture of tokens, lexemes, and
references to other abstractions; that is, of terminal and nonterminal symbols.
Example:

total = sub1 + sub2

A grammar is a finite nonempty set of rules and the abstractions are called nonterminal
symbols, or simply nonterminals.
The lexemes and tokens of the rules are called terminal symbols or terminals.
A BNF description, or grammar, is simply a collection of rules.
An abstraction (or nonterminal symbol) can have more than one RHS

<stmt> -> <single_stmt>
| begin <stmt_list> end
Multiple definitions can be written as a single rule, with the different definitions
separated by the symbol |, meaning logical OR.

Describing Lists

Syntactic lists are described using recursion.

<ident_list> -> ident
| ident, <ident_list>

A rule is recursive if its LHS appears in its RHS.

Grammars and derivations

The sentences of the language are generated through a sequence of applications of the
rules, beginning with a special nonterminal of the grammar called the start symbol.
A derivation is a repeated application of rules, starting with the start symbol and ending
with a sentence (all terminal symbols)
An example grammar:

<program> -> <stmts>
<stmts> -> <stmt> | <stmt> ; <stmts>
<stmt> -> <var> = <expr>
<var> -> a | b | c | d
<expr> -> <term> + <term> | <term> - <term>
<term> -> <var> | const

An example derivation for a simple statement a = b + const

<program> => <stmts> => <stmt>


=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const

Every string of symbols in the derivation, including <program>, is a sentential form.


A sentence is a sentential form that has only terminal symbols.
A leftmost derivation is one in which the leftmost nonterminal in each sentential form is
the one that is expanded. The derivation continues until the sentential form contains no
nonterminals.
A derivation may be neither leftmost nor rightmost.
Parse Trees

Hierarchical structures of the language are called parse trees.


A parse tree for the simple statement a = b + const

<program>
    |
 <stmts>
    |
  <stmt>
 /  |  \
<var> = <expr>
 |     /  |  \
 a  <term> + <term>
      |        |
    <var>    const
      |
      b

Ambiguity
A grammar is ambiguous if it generates a sentential form that has two or more distinct
parse trees.
Ex: Two distinct parse trees for the same sentence, const - const / const

<expr> -> <expr> <op> <expr> | const
<op> -> / | -

[Figure: the two distinct parse trees for const - const / const. In one tree, the -
operator is generated at the root and the / appears lower, grouping the sentence as
const - (const / const); in the other, the / operator is at the root and the - appears
lower, grouping it as (const - const) / const.]


Ex: Two distinct parse trees for the same sentence, A = B + A * C

<assign> -> <id> = <expr>

<id> -> A | B | C
<expr> -> <expr> + <expr>
| <expr> * <expr>
| (<expr>)
| <id>

Operator Precedence
The fact that an operator in an arithmetic expression is generated lower in the parse tree
can be used to indicate that it has higher precedence than an operator produced higher
up in the tree.
In the left parse tree above, one can conclude that the * operator has precedence over the
+ operator. What about the tree on the right-hand side?
An Unambiguous Grammar for Expressions

<assign> -> <id> = <expr>

<id> -> A | B | C
<expr> -> <expr> + <term>
| <term>
<term> -> <term> * <factor>
| <factor>
<factor> -> (<expr>)
| <id>

Leftmost derivation of the sentence A = B + C * A


<assign> => <id> = <expr>
=> A = <expr>
=> A = <expr> + <term>
=> A = <term> + <term>
=> A = <factor> + <term>
=> A = <id> + <term>
=> A = B + <term>
=> A = B + <term> * <factor>
=> A = B + <factor> * <factor>
=> A = B + <id> * <factor>
=> A = B + C * <factor>
=> A = B + C * <id>
=> A = B + C * A

A parse tree for the simple statement, A = B + C * A


Rightmost derivation of the sentence A = B + C * A
<assign> => <id> = <expr>
=> <id> = <expr> + <term>
=> <id> = <expr> + <term> * <factor>
=> <id> = <expr> + <term> * <id>
=> <id> = <expr> + <term> * A
=> <id> = <expr> + <factor> * A
=> <id> = <expr> + <id> * A
=> <id> = <expr> + C * A
=> <id> = <term> + C * A
=> <id> = <factor> + C * A
=> <id> = <id> + C * A
=> <id> = B + C * A
=> A = B + C * A
Both of these derivations, however, are represented by the same parse tree.
Associativity of Operators

Do parse trees for expressions with two or more adjacent occurrences of operators with
equal precedence have those occurrences in proper hierarchical order?
An example of an assignment using the previous grammar is:

A = B + C + A

The parse tree for this sentence shows the left + operator lower than the right + operator.
This is the correct order if the + operator is meant to be left associative, which is typical.
When a grammar rule has its LHS also appearing at the beginning of its RHS, the rule is
said to be left recursive. Left recursion specifies left associativity.
In most languages that provide it, the exponentiation operator is right associative. To
indicate right associativity, right recursion can be used. A grammar rule is right
recursive if the LHS appears at the right end of the RHS. Rules such as

<factor> -> <exp> ** <factor>
| <exp>
<exp> -> (<expr>)
| id
Extended BNF

Because of minor inconveniences in BNF, it has been extended in several ways. EBNF
extensions do not enhance the descriptive powers of BNF; they only increase its readability
and writability.
Optional parts are placed in brackets ([ ])

<proc_call> -> ident [(<expr_list>)]

Put alternative parts of RHSs in parentheses and separate them with vertical bars (|, OR
operator)

<term> -> <term> (+ | -) const

Put repetitions (0 or more) in braces ({ })

<ident> -> letter {letter | digit}

BNF:
<expr> -> <expr> + <term>
| <expr> - <term>
| <term>
<term> -> <term> * <factor>
| <term> / <factor>
| <factor>
<factor> -> <exp> ** <factor>
| <exp>
<exp> -> (<expr>)
| id

EBNF:
<expr> -> <term> {(+ | -) <term>}
<term> -> <factor> {(* | /) <factor>}
<factor> -> <exp> [** <factor>]
<exp> -> (<expr>)
| id
Describing the Meanings of Programs: Dynamic Semantics

Axiomatic Semantics

Axiomatic Semantics was defined in conjunction with the development of a method to prove
the correctness of programs.
Such correctness proofs, when they can be constructed, show that a program performs the
computation described by its specification.
In a proof, each statement of a program is both preceded and followed by a logical
expression that specifies constraints on program variables.
Approach: Define axioms or inference rules for each statement type in the language (to allow
transformations of expressions to other expressions.)
The expressions are called assertions.

Assertions

Axiomatic semantics is based on mathematical logic. The logical expressions are called
predicates, or assertions.
An assertion before a statement (a precondition) states the relationships and constraints
among variables that are true at that point in execution.
An assertion following a statement is a postcondition.
A weakest precondition is the least restrictive precondition that will guarantee the
validity of the associated postcondition.

Pre-post form: {P} statement {Q}

An example: a = b + 1 {a > 1}

One possible precondition: {b > 10}


Weakest precondition: {b > 0}

If the weakest precondition can be computed from the given postcondition for each
statement of a language, then correctness proofs can be constructed from programs in that
language.
Program proof process: The postcondition for the whole program is the desired result.
Work back through the program to the first statement. If the precondition on the first
statement is the same as the program spec, the program is correct.
An axiom is a logical statement that is assumed to be true.
An inference rule is a method of inferring the truth of one assertion on the basis of the
values of other assertions.
Assignment Statements

Ex:

a = b / 2 - 1 {a < 10}

The weakest precondition is computed by substituting b / 2 - 1 into the assertion {a < 10} as


follows:

b / 2 - 1 < 10
b / 2 < 11
b < 22

the weakest precondition for the given assignment and the postcondition is {b < 22}

An assignment statement has a side effect if it changes some variable other than its left
side.
Ex:

x = 2 * y - 3 {x > 25}
2 * y - 3 > 25
2 * y > 28
y > 14

the weakest precondition for the given assignment and the postcondition is {y > 14}

Ex:

x = x + y - 3 {x > 10}
x + y - 3 > 10
y > 13 - x

Sequences

The weakest precondition for a sequence of statements cannot be described by an axiom,


because the precondition depends on the particular kinds of statements in the sequence.
In this case, the precondition can only be described with an inference rule.
Ex:

y = 3 * x + 1;
x = y + 3; {x < 10}

y + 3 < 10
y < 7
3 * x + 1 < 7
3 * x < 6
x < 2
The precondition for the first assignment statement is {x < 2}

Selection

Example of selection statement is

if (x > 0)
y = y - 1;
else y = y + 1;
{y > 0}

We can use the axiom for assignment on the then clause


y = y - 1 {y > 0}
This produces the precondition {y - 1 > 0}, or {y > 1}.
Now we apply the same axiom to the else clause
y = y + 1 {y > 0}
This produces the precondition {y + 1 > 0}, or {y > -1}.
The precondition of the selection must satisfy both: y > 1 AND y > -1.
Because {y > 1} => {y > -1}, the rule of consequence allows us to use {y > 1} as the
precondition of the whole selection statement.

Logical Pretest Loops

Computing the weakest precondition (wp) for a while loop is inherently more difficult
than for a sequence, because the number of iterations can't always be predetermined.
The corresponding step in the axiomatic semantics of a while loop is finding an assertion
called a loop invariant, which is crucial to finding the weakest precondition.
It is helpful to treat the process of producing the wp as a function, wp.
To find I, we use the loop postcondition to compute preconditions for several different
numbers of iterations of the loop body, starting with none. If the loop body contains a
single assignment statement, the axiom for assignment statements can be used to compute
these cases.

wp(statement, postcondition) = precondition

Ex:

while y <> x do y = y + 1 end {y = x}

For 0 iterations, the wp is {y = x}


For 1 iteration,
wp(y = y + 1, {y = x}) = {y + 1 = x}, or {y = x - 1}
For 2 iterations,
wp(y = y + 1, {y = x - 1}) = {y + 1 = x - 1}, or {y = x - 2}
For 3 iterations,
wp(y = y + 1, {y = x - 2}) = {y + 1 = x - 2}, or {y = x - 3}

It is now obvious that {y < x} will suffice for cases of one or more iterations. Combining
this with {y = x} for the zero-iteration case, we get {y <= x}, which can be used as the loop
invariant.

Ex:

while s > 1 do s = s / 2 end {s = 1}

For 0 iterations, the wp is {s = 1}


For 1 iteration,
wp(s = s / 2, {s = 1}) = {s / 2 = 1}, or {s = 2}
For 2 iterations,
wp(s = s / 2, {s = 2}) = {s / 2 = 2}, or {s = 4}
For 3 iterations,
wp(s = s / 2, {s = 4}) = {s / 2 = 4}, or {s = 8}

Loop Invariant I is {s is a nonnegative power of 2}


The loop invariant I is a weakened version of the loop postcondition, and it is also a
precondition.
I must be weak enough to be satisfied prior to the beginning of the loop, but when
combined with the loop exit condition, it must be strong enough to force the truth of
the postcondition.
