© Springer International Publishing AG 2017
Gerard O'ReganConcise Guide to Software EngineeringUndergraduate Topics in Computer Science10.1007/978-3-319-57750-0_13

13. Z Formal Specification Language

Gerard O’Regan 
(1)
SQC Consulting, Cork, Ireland
 
 
Gerard O’Regan
Abstract
This chapter presents the Z specification language, which is one of the most widely used formal methods. Z is a formal specification language based on Zermelo set theory. It was developed at the Programming Research Group at Oxford University in the early 1980s. Z specifications are mathematical and employ a classical two-valued logic. The use of mathematics ensures precision and allows inconsistencies and gaps in the specification to be identified. Theorem provers may be employed to demonstrate that the software implementation meets its specification.
Keywords
Sets, relations and functionsBags and sequencesPreconditionPost-conditionInvariantData reificationRefinementSchema calculusProof in Z

13.1 Introduction

Z is a formal specification language based on Zermelo set theory. It was developed at the Programming Research Group at Oxford University in the early 1980s [1] and became an ISO standard in 2002. Z specifications are mathematical and employ a classical two-valued logic. The use of mathematics ensures precision and allows inconsistencies and gaps in the specification to be identified. Theorem provers may be employed to prove properties of the specification and to demonstrate that the software implementation meets its specification.
Z is a “and an explicit model” approach with an explicit model of the state of an abstract machine given, and operations are defined in terms of this state. Its mathematical notation is used for formal specification, and its schema calculus is used to structure the specification. The schema calculus is visually striking and consists essentially of boxes, with these boxes or schemas used to describe operations and states. The schemas may be used as building blocks and combined with other schemas. The simple schema below (Fig. 13.1) is the specification of the positive square root of a real number.
A447511_1_En_13_Fig1_HTML.gif
Fig. 13.1
Specification of positive square root
The schema calculus is a powerful means of decomposing a specification into smaller pieces or schemas. This helps to make Z specifications highly readable, as each individual schema is small in size and self-contained. Exception handling is addressed by defining schemas for the exception cases. These are then combined with the original operation schema. Mathematical data types are used to model the data in a system, and these data types obey mathematical laws. These laws enable simplification of expressions and are useful with proofs.
Operations are defined in a precondition/post-condition style. A precondition must be true before the operation is executed, and the post-condition must be true after the operation has executed. The precondition is implicitly defined within the operation. Each operation has an associated proof obligation to ensure that if the precondition is true, then the operation preserves the system invariant. The system invariant is a property of the system that must be true at all times. The initial state itself is, of course, required to satisfy the system invariant.
The precondition for the specification of the square root function above is that num? ≥ 0; i.e., the function SqRoot may be applied to positive real numbers only. The post-condition for the square root function is root!2 = num? and root! ≥ 0. That is, the square root of a number is positive and its square gives the number. Post-conditions employ a logical predicate which relates the prestate to the post-state, with the post-state of a variable being distinguished by priming the variable, e.g. v′.
Z is a typed language, and whenever a variable is introduced, its type must be given. A type is simply a collection of objects, and there are several standard types in Z. These include the natural numbers ℕ, the integers ℤ and the real numbers ℝ. The declaration of a variable x of type X is written as x:X. It is also possible to create your own types in Z.
Various conventions are employed within Z specification: for example, v? indicates that v is an input variable, and v! indicates that v is an output variable. The variable num? is an input variable, and root! is an output variable in the square root schema above. The notation Ξ Op in a schema indicates that the operation Op does not affect the state, whereas the notation ∆Op in the schema indicates that Op is an operation that affects the state.
Many of the data types employed in Z have no counterpart in standard programming languages. It is therefore important to identify and describe the concrete data structures that ultimately will represent the abstract mathematical structures. As the concrete structures may differ from the abstract, the operations on the abstract data structures may need to be refined to yield operations on the concrete data that yield equivalent results. For simple systems, direct refinement (i.e. one step from abstract specification to implementation) may be possible; in more complex systems, deferred refinement1 is employed, where a sequence of increasingly concrete specifications are produced to yield the executable specification. There is a calculus for combining schemas to make larger specifications, and this is discussed later in the chapter.
Example 13.1
The following is a Z specification to borrow a book from a library system. The library is made up of books that are on the shelf; books that are borrowed; and books that are missing. The specification models a library with sets representing books on the shelf, on loan or missing. These are three mutually disjoint subsets of the set of books Bkd-Id. The system state is defined in the Library schema (Fig. 13.2), and operations such as Borrow and Return affect the state. The Borrow operation is specified in Fig. 13.3.
A447511_1_En_13_Fig2_HTML.gif
Fig. 13.2
Specification of a library system
A447511_1_En_13_Fig3_HTML.gif
Fig. 13.3
Specification of borrow operation
The notation ℙBkd-Id is used to represent the power set of Bkd-Id (i.e. the set of all subsets of Bkd-Id). The disjointness condition for the library is expressed by the requirement that the pairwise intersection of the subsets on-shelf, borrowed and missing is the empty set.
The precondition for the Borrow operation is that this book must be available on the shelf to borrow. The post-condition is that the borrowed book is added to the set of borrowed books and is removed from the books on the shelf.
Z has been successfully applied in industry including the CICS project at IBM Hursley in the UK.2 Next, we describe key parts of Z including sets, relations, functions, sequences and bags.

13.2 Sets

A set is a collection of well-defined objects, and this section focuses on their use in Z. Sets may be enumerated by listing all of their elements. Thus, the set of all even natural numbers less than or equal to 10 is as follows:
$$ \left\{ {2, 4, 6, 8, 10} \right\} $$
Sets may be created from other sets using set comprehension, i.e. stating the properties that its members must satisfy. For example, the set of even natural numbers less than or equal to 10 is given by set comprehension as follows:
$$ \{ n:{\mathbb{N}}|n \ne 0 \wedge n \le 10 \wedge n\,\bmod \, 2= 0 \cdot n\} $$
There are three main parts to the set comprehension above. The first part is the signature of the set, and this is given by n:ℕ. The first part is separated from the second part by a vertical line. The second part is given by a predicate, and for this example, the predicate is n ≠ 0 ∧ n ≤ 10 ∧ n mod 2 = 0. The second part is separated from the third part by a bullet. The third part is a term, and for this example, it is simply n. The term is often a more complex expression, e.g. log(n 2).
In mathematics, there is just one empty set ∅. However, there is an empty set for each type of set in Z (as Z is a typed language), and so there are an infinite number of empty sets in Z. The empty set is written as ∅ [X] where X is the type of the empty set. However, in practice, X is omitted when the type is clear.
Various set operations such as union, intersection, set difference and symmetric difference are employed in Z. The power set of a set X is the set of all subsets of X, and it is denoted by ℙX. The set of non-empty subsets of X is denoted by ℙ1X where
$$ {\mathbb{P}}_{ 1} {\text{X}} = = \{ {\text{U}}:{\mathbb{P}}{\text{X}}|{\text{U}} \ne {\oslash }\,[{\text{X}}]\} $$
A finite set of elements of type X (denoted by F X) is a subset of X that cannot be put into a one to one correspondence with a proper subset of itself. That is:
A447511_1_En_13_Equa_HTML.gif
The expression A447511_1_En_13_Figa_HTML.gif denotes that f is a bijection from U to V, and injective, surjective and bijective functions are discussed in [2].
The fact that Z is a typed language means that whenever a variable is introduced (e.g. in quantification with ∀ and ∃), it is first declared. For example, ∀j:J · P ⇒ Q. There is also the unique existential quantifier ∃1 j:J|P which states that there is exactly one j of type J that has property P.

13.3 Relations

Relations are used extensively in Z, and a relation R between X and Y is any subset of the Cartesian product of X and Y, i.e., R ⊆ (X × Y). A relation in Z is denoted by R: X ↔Y, and the notation xy indicates that the pair (x, y) ∈ R.
Consider the relation home_owner: PersonHome that exists between people and their homes. An entry daphne ↦ mandalayhome_owner if daphne is the owner of mandalay. It is possible for a person to own more than one home:
$$\begin{array}{*{20}l} {rebecca \mapsto nirvana \in home\_owner} \hfill \\ {rebecca \mapsto tivoli \in home\_owner} \hfill \\ \end{array}$$
It is possible for two people to share ownership of a home:
$$\begin{array}{*{20}l} {rebecca \mapsto nirvana \in home\_owner} \hfill \\ {lawrence \mapsto nirvana \in home\_owner} \hfill \\ \end{array}$$
There may be some people who do not own a home, and there is no entry for these people in the relation home_owner. The type Person includes every possible person, and the type Home includes every possible home. The domain of the relation home_owner is given by:
$$x \in {\text{dom}}\,home\_owner \Leftrightarrow \exists h:Home \cdot x \mapsto h \in home\_owner.$$
The range of the relation home_owner is given by:
$$h \in {\text{ran}}\,home\_owner \Leftrightarrow \exists x:Person \cdot x \mapsto h \in home\_owner.$$
The composition of two relations home_owner: PersonHome and home_value: HomeValue yields the relation owner_wealth: PersonValue and is given by the relational composition home_owner; home_value where
$$\begin{array}{*{20}l} {p \mapsto v \in home\_owner;home\_value \Leftrightarrow } \hfill \\ {(\exists h:Home \cdot p \mapsto h \in home\_owner \wedge h \mapsto v \in home\_value)} \hfill \\ \end{array}$$
The relational composition may also be expressed as:
$$owner\_wealth = home\_value\,{\text{o}}\,home\_owner$$
The union of two relations often arises in practice. Suppose a new entry aisling ↦ muckross is to be added. Then, this is given by
$$home\_owner^{{\prime }} = home\_owner \cup \{ aisling \mapsto muckross\}$$
Suppose that we are interested in knowing all females who are house owners. Then, we restrict the relation home_owner so that the first element of all ordered pairs has to be female. Consider female: ℙ Person with {aisling, rebecca} ⊆ female.
$$\begin{array}{*{20}c} {home\_owner = \, \{ aisling \mapsto muckross,rebecca \mapsto nirvana,} \\ {lawrence \mapsto nirvana\} } \\ \end{array}$$
$$female \triangleleft home\_owner = \{aisling \mapsto muckross,\ rebecca \mapsto nirvana\}$$
That is, $$female \triangleleft home\_owner$$ is a relation that is a subset of home_owner, such that the first element of each ordered pair in the relation is female. The operation $$\triangleleft$$ is termed domain restriction, and its fundamental property is:
$$x \mapsto y \in U \triangleleft R \Leftrightarrow (x \in U \wedge x \mapsto y \in {\text{R}}\}$$
where R: X ↔Y and U: ℙX.
There is also a domain anti-restriction (subtraction) operation, and its fundamental property is:
A447511_1_En_13_Equb_HTML.gif
where R: X ↔Y and U: ℙX.
There are also range restriction (the $$\triangleright$$ operator) and the range anti-restriction operator (the A447511_1_En_13_Figb_HTML.gifoperator). These are discussed in [1].

13.4 Functions

A function is an association between objects of some type X and objects of another type Y such that given an object of type X, there exists only one object in Y associated with that object [1]. A function is a set of ordered pairs where the first element of the ordered pair has at most one element associated with it. A function is therefore a special type of relation, and a function may be total or partial.
A total function has exactly one element in Y associated with each element of X, whereas a partial function has at most one element of Y associated with each element of X (there may be elements of X that have no element of Y associated with them). A partial function from X to Y(f: X $${ \nrightarrow }$$ Y) is a relation f: XY such that:
$$\forall x:{\text{X}};y,z:{\text{Y}}\cdot(x \mapsto y \in f \wedge x \mapsto z \in f \Rightarrow y = z)$$
The association between x and y is denoted by f(x) = y, and this indicates that the value of the partial function f at x is y. A total function from X to Y (denoted f: XY) is a partial function such that every element in X is associated with some value of Y.
$$f:X \to Y \Leftrightarrow f:X{ \nrightarrow }Y \wedge {\text{dom}}\,f = X$$
Clearly, every total function is a partial function but not vice versa.
A447511_1_En_13_Figc_HTML.gif
One operation that arises quite frequently in specifications is the function override operation. Consider the specification of a temperature map above and an example temperature map given by temp = {Cork ↦ 17, Dublin ↦ 19, London ↦ 15}. Then, consider the problem of updating the temperature map if a new temperature reading is made in Cork, e.g. {Cork ↦ 18}. Then, the new temperature chart is obtained from the old temperature chart by function override to yield {Cork ↦ 18, Dublin ↦ 19, London ↦ 15}. This is written as follows:
$$temp^{{\prime }} = temp{ \oplus }\{ Cork \mapsto 1 8\}$$
The function override operation combines two functions of the same type to give a new function of the same type. The effect of the override operation is that the entry {Cork ↦ 17} is removed from the temperature chart and replaced with the entry {Cork ↦ 18}.
Suppose f, g: XY are partial functions, then fg is defined and indicates that f is overridden by g. It is defined as follows:
$$\begin{array}{*{20}l} {(f{ \oplus }g)(x) = g(x) {\text{where}}\,x \in {\text{dom}}\,g} \hfill \\ {(f{ \oplus }g)(x) = f(x) {\text{where}}\,x \notin {\text{dom}}\,g \wedge x \in {\text{dom}}\,f} \hfill \\ \end{array}$$
This may also be expressed (using domain anti-restriction) as follows:
A447511_1_En_13_Equc_HTML.gif
There is notation in Z for injective, surjective and bijective functions. An injective function is one to one, i.e.
$$f\left( x \right) = f\left( y \right) \Rightarrow x = y$$
A surjective function is onto, i.e.
Given yY, ∃xX such that f(x) = y
A bijective function is one to one and onto, and it indicates that the sets X and Y can be put into one to one correspondence with one another. Z includes lambda calculus notation (λ calculus is discussed in [2]) to define functions. For example, the function cube = λx:N · x * x * x. Function composition is f; g is similar to relational composition.

13.5 Sequences

The type of all sequences of elements drawn from a set X is denoted by seq X. Sequences are written as $$\left\langle {x_{ 1} ,x_{ 2} , \ldots \ x_{n} } \right\rangle$$, and the empty sequence is denoted by 〈〉. Sequences may be used to specify the changing state of a variable over time, with each element of the sequence representing the value of the variable at a discrete time instance.
Sequences are functions, and a sequence of elements drawn from a set X is a finite function from the set of natural numbers to X. A finite partial function f from X to Y is denoted by f: X A447511_1_En_13_Figd_HTML.gif Y.
A finite sequence of elements of X is given by f: N A447511_1_En_13_Fige_HTML.gif X, and the domain of the function consists of all numbers between 1 and #f (where #f is the cardinality of f). It is defined formally as follows:
A447511_1_En_13_Equd_HTML.gif
The sequence $$\left\langle {x_{ 1} ,x_{ 2} , \ldots \ x_{n} } \right\rangle$$ above is given by:
$$\{ 1\mapsto x_{ 1} , 2\mapsto x_{ 2} , \ldots\ n \mapsto x_{n} \}$$
There are various functions to manipulate sequences. These include the sequence concatenation operation. Suppose $$\upsigma = \left\langle {x_{ 1} ,x_{ 2} , \ldots \ x_{n} } \right\rangle$$ and $$\uptau = \left\langle {y_{ 1} , \, y_{ 2} , \ldots \ y_{m} } \right\rangle$$, then:
$$\upsigma \cap\uptau = \left\langle {x_{ 1} ,x_{ 2} , \ldots\ x_{n} ,y_{ 1} ,y_{ 2} , \ldots \ y_{m} } \right\rangle$$
The head of a non-empty sequence gives the first element of the sequence:
$${\text{heads }}\sigma = {\text{head}}\left\langle {x_{ 1} ,x_{ 2} , \ldots \ x_{n} } \right\rangle = x_{ 1}$$
The tail of a non-empty sequence is the same sequence except that the first element of the sequence is removed:
$${\text{tail}}\,\sigma = {\text{tail}}\left\langle {x_{ 1} ,x_{ 2} , \ldots \ x_{n} } \right\rangle = \left\langle {x_{ 2} , \ldots \ x_{n} } \right\rangle$$
Suppose f: X → Y and a sequence σ: seq X, then the function map applies f to each element of σ:
$${\text{map}}\,f\sigma = {\text{map}}\,f\left\langle {x_{ 1} , \, x_{ 2} , \ldots \ x_{n} } \right\rangle = \left\langle {f\left( {x_{ 1} } \right),f\left( {x_{ 2} } \right), \ldots \ f\left( {x_{n} } \right)} \right\rangle$$
The map function may also be expressed via function composition as:
$${\text{map}}\,f\,\sigma = \sigma ;f$$
The reverse order of a sequence is given by the rev function:
$${\text{rev}}\,\sigma = {\text{rev}}\left\langle {x_{ 1} ,x_{ 2} , \ldots \ x_{n} } \right\rangle = \left\langle {x_{n} , \ldots \ x_{ 2} ,x_{ 1} } \right\rangle$$

13.6 Bags

A bag is similar to a set except that there may be multiple occurrences of each element in the bag. A bag of elements of type X is defined as a partial function from the type of the elements of the bag to positive whole numbers. The definition of a bag of type X is:
$${\text{bag}}\,X = X{ \nrightarrow }{\mathbb{N}}_{ 1} .$$
For example, a bag of marbles may contain 3 blue marbles, 2 red marbles and 1 green marble. This is denoted by B = [−b, b, b, g, r, r]. The bag of marbles is thus denoted by:
$${\text{bag}}\,Marble = Marble{ \nrightarrow }{\mathbb{N}}_{ 1} .$$
The function count determines the number of occurrences of an element in a bag. For the example above, count Marble b = 3 and count Marble y = 0 since there are no yellow marbles in the bag. This is defined formally as follows:
$$\begin{array}{*{20}l} {{\text{count}}\,{\text{bag}}\,X\,y = 0 } \hfill & {y \notin {\text{bag X}}} \hfill \\ {{\text{count}}\,{\text{bag}}\,X\,y = \left( {{\text{bag}}\,X} \right)(y)} \hfill & {y \in {\text{bag X}}} \hfill \\ \end{array}$$
An element y is in bag X if and only if y is in the domain of bag X:
$$y\,{\text{in}}\,{\text{bag}}\,X \Leftrightarrow y \in {\text{dom}}\left( {{\text{bag}}\,X} \right)$$
The union of two bags of marbles B1 = [b, b, b, g, r, r] and B2 = [b, g, r, y] is given by B1 ⊎ B2 = [b, b, b, b, g, g, r, r, r, y]. It is defined formally as follows:
$$\begin{array}{*{20}l} {({\text{B}}_{ 1} { \uplus }{\text{B}}_{ 2} )\left( y \right) = {\text{B}}_{ 2} \left( y \right)} \hfill & {y \notin {\text{dom}}\,{\text{B}}_{ 1} \wedge y \in {\text{dom}}\,{\text{B}}_{ 2} } \hfill \\ {({\text{B}}_{ 1} { \uplus }{\text{B}}_{ 2} )\left( y \right) = {\text{B}}_{ 1} \left( y \right)} \hfill & {y \in {\text{dom}}\,{\text{B}}_{ 1} \wedge y \notin {\text{dom}}\,{\text{B}}_{ 2} } \hfill \\ {({\text{B}}_{ 1} { \uplus }{\text{B}}_{ 2} )\left( y \right) = {\text{B}}_{ 1} \left( y \right) + {\text{B}}_{ 2} \left( y \right)} \hfill & {y \in {\text{dom}}\,{\text{B}}_{ 1} \wedge y \in {\text{dom}}\,{\text{B}}_{ 2} } \hfill \\ \end{array}$$
A bag may be used to record the number of occurrences of each product in a warehouse as part of an inventory system. It may model the number of items remaining for each product in a vending machine (Fig. 13.4).
A447511_1_En_13_Fig4_HTML.gif
Fig. 13.4
Specification of vending machine using bags
The operation of a vending machine would require other operations such as identifying the set of acceptable coins, checking that the customer has entered sufficient coins to cover the cost of the good, returning change to the customer and updating the quantity on hand of each good after a purchase. A detailed account is in [1].

13.7 Schemas and Schema Composition

The Z specification is presented in visually striking boxes called schemas. These are used for specifying states and state transitions, and they employ notation to represent the before and after state (e.g. s and s′ where s′ represents the after state of s). They group all relevant information that belongs to a state description.
There are a number of useful schema operations such as schema inclusion , schema composition and the use of propositional connectives to link schemas together. The ∆ convention indicates that the operation affects the state, whereas the Ξ convention indicates that the state is not affected. These conventions allow complex operations to be specified concisely and assist with the readability of the specification. Schema composition is analogous to relational composition and allows new schemas to be derived from existing schemas.
A schema name S1 may be included in the declaration part of another schema S2. The effect of the inclusion is that the declarations in S1 are now part of S2 and the predicates of S1 are S2 are joined together by conjunction. If the same variable is defined in both S1 and S2, then it must be of the same type in both.
A447511_1_En_13_Figf_HTML.gif
The result is that S2 includes the declarations and predicates of S1 (Fig. 13.5):
A447511_1_En_13_Fig5_HTML.gif
Fig. 13.5
Schema inclusion
Two schemas may be linked by propositional connectives such as S1 ∧ S2, S1 ∨ S2, S1 ⇒ S2, and S1 ⇔ S2. The schema S1 ∨ S2 is formed by merging the declaration parts of S1 and S2 and then combining their predicates by the logical ∨ operator. For example, S = S1 ∨ S2 yields as shown in Fig. 13.6.
A447511_1_En_13_Fig6_HTML.gif
Fig. 13.6
Merging schemas (S1 ∨ S2)
Schema inclusion and the linking of schemas use normalization to convert subtypes to maximal types, and predicates are employed to restrict the maximal type to the subtype. This involves replacing declarations of variables (e.g. u: 1.35 with u:Z, and adding the predicate u > 0 and u < 36 to the predicate part). The ∆ and Ξ conventions are used extensively, where the notation ∆TempMap is used in the specification of schemas that involve a change of state.
$$\Delta TempMap = TempMap \wedge TempMap^{\text{'}}$$
The longer form of ∆TempMap is written as follows:
A447511_1_En_13_Figg_HTML.gif
The notation Ξ TempMap is used in the specification of operations that do not involve a change to the state.
A447511_1_En_13_Figh_HTML.gif
Schema composition is analogous to relational composition, and it allows new specifications to be built from existing ones. It allows the after state variables of one schema to be related with the before variables of another schema. The composition of two schemas S and T (S; T) is described in detail in [1] and involves 4 steps (Table 13.1):
Table 13.1
Schema composition
Step
Procedure
1.
Rename all after state variables in S to something new: S [s +/s′]
2.
Rename all before state variables in T to the same new thing, i.e. T [s +/s]
3.
Form the conjunction of the two new schemas: S [s +/s′] ∧ T [s +/s]
4.
Hide the variable introduced in step 1 and 2. S; T = (S [s +/s′] ∧ T [s +/s])\(s +)
The example below should make schema composition clearer. Consider the composition of S and T where S and T are defined as follows:
A447511_1_En_13_Figi_HTML.gif
S1 and T1 represent the results of step 1 and step 2, with x′ renamed to x + in S, and x renamed to x + in T. Step 3 and step 4 yield as shown in Fig. 13.7.
A447511_1_En_13_Fig7_HTML.gif
Fig. 13.7
Schema composition
Schema composition is useful as it allows new specifications to be created from existing ones.

13.8 Reification and Decomposition

A Z specification involves defining the state of the system and then specifying the required operations. The Z specification language employs many constructs that are not part of conventional programming languages, and a Z specification is therefore not directly executable on a computer. A programmer implements the formal specification, and mathematical proof may be employed to prove that a program meets its specification.
Often, there is a need to write an intermediate specification that is between the original Z specification and the eventual program code. This intermediate specification is more algorithmic and uses less abstract data types than the Z specification. The intermediate specification needs to be correct with respect to the specification, and the program needs to be correct with respect to the intermediate specification. The intermediate specification is a refinement (reification) of the state of the specification, and the operations of the specification have been decomposed into those of the intermediate specification.
The representation of an abstract data type such as a set by a sequence is termed data reification , and data reification is concerned with the process of transforming an abstract data type into a concrete data type. The abstract and concrete data types are related by the retrieve function, and the retrieve function maps the concrete data type to the abstract data type. There are typically several possible concrete data types for a particular abstract data type (i.e. refinement is a relation), whereas there is one abstract data type for a concrete data type (i.e. retrieval is a function). For example, sets are often refined to unique sequences; however, more than one unique sequence can represent a set, whereas a unique sequence represents exactly one set.
The operations defined on the concrete data type are related to the operations defined on the abstract data type. That is, the commuting diagram property is required to hold (Fig. 13.8). That is, for an operation $$\boxdot$$ on the concrete data type to correctly model the operation $$\odot$$ on the abstract data type, the commuting diagram property must hold. That is, it is required to prove that:
A447511_1_En_13_Fig8_HTML.gif
Fig. 13.8
Refinement commuting diagram
$$ret (\sigma \boxdot \tau) = (ret \ \sigma) \odot (ret \ \tau)$$
In Z, the refinement and decomposition are done with schemas. It is required to prove that the concrete schema is a valid refinement of the abstract schema, and this gives rise to a number of proof obligations. It needs to be proved that the initial states correspond to one another; that each operation in the concrete schema is correct with respect to the operation in the abstract schema; and also that it is applicable (i.e. whenever the abstract operation may be performed, the concrete operation may also be performed).

13.9 Proof in Z

Mathematicians perform rigorous proof of theorems using technical and natural language. Logicians employ formal proofs to prove theorems using propositional and predicate calculus. Formal proofs generally involve a long chain of reasoning with every step of the proof justified. Rigorous proofs involve precise reasoning using a mixture of natural and mathematical language. Rigorous proofs [1] have been described as being analogous to high-level programming languages, with formal proofs analogous to machine language.
A mathematical proof includes natural language and mathematical symbols, and often many of the tedious details of the proof are omitted. Many proofs in formal methods such as Z are concerned with cross-checking on the details of the specification, or on the validity of the refinement step, or proofs that certain properties are satisfied by the specification. There are often many tedious lemmas to be proved, and tool support is essential as proof by hand often contains errors or jumps in reasoning. Machine proofs are lengthy and largely unreadable; however, they provide extra confidence as every step in the proof is justified. The proof of various properties about the programs increases confidence in its correctness.

13.10 Review Questions

  1. 1.
    Describe the main features of the Z specification language.
     
  2. 2.
    Explain the difference between ℙ1X, ℙX and FX.
     
  3. 3.
    Explain the three main parts of set comprehension in Z. Give examples.
     
  4. 4.
    Discuss the applications of Z. What problems have arisen?
     
  5. 5.
    Give examples to illustrate the use of domain and range restriction operators and domain and range anti-restriction operators with relations in Z.
     
  6. 6.
    Give examples to illustrate relational composition.
     
  7. 7.
    Explain the difference between a partial and total function, and give examples to illustrate function override.
     
  8. 8.
    Give examples to illustrate the various operations on sequences including concatenation, head, tail, map and reverse operations.
     
  9. 9.
    Give examples to illustrate the various operations on bags.
     
  10. 10.
    Discuss the nature of proof in Z and tools to support proof.
     
  11. 11.
    Explain the process of refining an abstract schema to a more concrete representation, the proof obligations and the commuting diagram property.
     

13.11 Summary

Z is a formal specification language that was developed in the early 1980s at Oxford University in England. It has been employed in both industry and academia, and it was used successfully on the IBM’s CICS project at Hursley. Its specifications are mathematical, and this allows properties to be proved about the specification, and any gaps or inconsistencies in the specification may be identified.
Z is a “and an explicit model” approach and an explicit model of the state of an abstract machine is given, and the operations are defined in terms of their effect on the state. Its main features include a mathematical notation that is similar to VDM and the schema calculus. The latter consists essentially of boxes that are used to describe operations and states.
The schemas are used as building blocks to form larger specifications, and they are a powerful means of decomposing a specification into smaller pieces. This helps with the readability of Z specifications, since each schema is small in size and self-contained.
Z is a highly expressive specification language, and it includes notation for sets, functions, relations, bags, sequences, predicate calculus, and schema calculus. Z specifications are not directly executable, as many of its data types and constructs are not part of modern programming languages. A programmer implements the formal specification, and mathematical proof may be employed to prove that a program meets its specification.
Often, there is a need to write an intermediate specification that is between the original Z specification and the eventual program code. This intermediate specification is more algorithmic and uses less abstract data types than the Z specification. The intermediate specification needs to be correct with respect to the specification, and the program needs to be correct with respect to the intermediate specification. The intermediate specification is a refinement (reification) of the state of the specification, and the operations of the specification have been decomposed into those of the intermediate specification.
Therefore, there is a need to refine the Z specification into a more concrete representation and prove that the refinement is valid. The refinement and decomposition are done with schemas, and it is required to prove that the concrete schema is a valid refinement of the abstract schema. This gives rise to a number of proof obligations, and it needs to be shown that each operation in the concrete schema is correct with respect to the operation in the abstract schema.
References
1.
A. Diller, Z. An Introduction to Formal Methods (John Wiley and Sons, England, 1990)
2.
G. O’Regan, Guide to Discrete Mathematics (Springer, 2016)
Footnotes
1
Stepwise refinement involves producing a sequence of increasingly more concrete specifications until eventually the executable code is produced. Each refinement step has associated proof obligations to prove that the refinement step is valid.
 
2
This project claimed a 9% increase in productivity attributed to the use of formal methods.