Software Specification: Algebraic Datatypes, Lemmas and Calculations

Software Specification, MEIC @ IST, 2020/21

João F. Ferreira

 @jff    https://joaoff.com
### At the end of this session, you should be able to: - Use arrays, quantifiers, and predicates in Dafny - Verify simple algorithms using Dafny - Define algebraic inductive datatypes - Verify code using inductive arguments - Write code to prove lemmas
## Last session, we discussed: - Program termination: notions of _well-founded domain_ and _variant_; _decreases_ annotation - Arrays and quantifiers in Dafny

Arrays

  • Arrays are a built-in part of Dafny. Their type is array<T>, where T is another type.
  • They have similarities with arrays in mainstream languages such as Java:
    • Arrays can be null, and have a built-in length field, a.Length
    • Element access uses the standard bracket syntax and are indexed from zero: a[0], a[1], ...
  • In Dafny, all array accesses must be proven to be within bounds. Because bounds checks are proven at verification time, no runtime checks need to be made.

Arrays

  • Example: Searching an array for a particular key, returning the index of a place where we can find the key if it exists.
  • We have two outcomes for a search:
    • If the algorithm returns an index (i.e. non-negative integer), then the key should be present at that index.
    • If the key is not in the array, then we would like the method to return a negative number. In this case, we want to say that the method did not miss an occurrence of the key; in other words, that the key is not in the array.

Exercise: Specify the first requirement of Find: if the algorithm returns an index (i.e. non-negative integer), then the key should be present at that index.

Use the following signature for your method:

method Find(a: array<int>, key: int) returns (index: int)

Exercise: Implement Find in a way that satisfies your specification.

Arrays

  • To satisfy the first requirement:
    method Find(a: array<int>, key: int) returns (index: int)
    ensures 0 <= index ==> index < a.Length && a[index] == key
    {
       // ...
    }
  • To satisfy the second requirement, we need to introduce quantifiers

Quantifiers

  • A quantifier in Dafny most often takes the form of a forall expression, also called a universal quantifier. Example:
    forall k :: k < k + 1
  • We often quantify over the elements of an array or data structure:
    forall k :: 0 <= k < a.length ==>  0 <= a[k]
  • We can also write existential quantifications:
    exists k :: 0 <= k < a.length ==>  0 <= a[k]

Exercise: Use quantifiers to specify that the method Find returns a negative number whenever the searched key is not present in the array.

method Find(a: array<int>, key: int) returns (index: int)
   ensures 0 <= index ==> index < a.Length && a[index] == key
{
   // ...
}

Binary Search

  • A linear search is not very efficient, especially when many queries are made of the same data. If the array is sorted, then we can use the very efficient binary search procedure to find the key.
  • But in order for us to be able to prove our implementation correct, we need some way to require that the input array actually is sorted.
  • We could do this directly with a quantifier inside a requires clause of our method, but a more modular way to express this is through a predicate.

Predicates

  • A predicate is a function which returns a boolean
  • For example, we define the sorted predicate over arrays of integers as a function that takes an array as an argument, and returns true if and only if that array is sorted in increasing order:
    predicate sorted(a: array<int>)
       requires a != null
    {
       forall j, k :: 0 <= j < k < a.Length ==> a[j] <= a[k]
    }

    Exercise

    Write this predicate in Dafny. Try to explain the error message that appears.

Predicates

  • Dafny rejects the previous predicate, claiming that the predicate cannot read a.
  • The sorted predicate is not able to access the array because the array was not included in the function's reading frame.
    • The reading frame of a function (or predicate) is all the memory locations that the function is allowed to read.
  • The reason we might limit what a function can read is so that when we write to memory, we can be sure that functions that did not read that part of memory have the same value they did before. Framing is essential to making the verification process feasible.

reads annotation

predicate sorted(a: array<int>)
   requires a != null
   reads a
{
   forall j, k :: 
        0 <= j < k < a.Length ==> 
        a[j] <= a[k]
}

Modifies Annotation

  • Methods are allowed to read whatever memory they like, but they are required to list which parts of memory they modify, with a modifies annotation. Example:
     method m3(a: array<int>)
        requires 1 <= a.Length
        modifies a
    {
        a[0] := 42;
    }
  • Reads and modifies are one of the tools that allow Dafny to work on one method at a time, because they restrict what would otherwise be arbitrary modifications of memory to something that Dafny can reason about.

Exercise: Write a verified implementation of binary search. Use the following signature:

method BinarySearch(a: array<int>, key: int) 
                                        returns (index: int)
    requires a != null && sorted(a)
    ensures ...
{
	...
}

Algebraic Inductive Datatypes

  • An algebraic data type is a structured type that is formed by composing other types. For example:
    datatype Colour = Red | White | Blue
  • A more interesting example is:
    datatype Nat = Zero | Succ(n: Nat)
  • The values of inductive datatypes can be seen as finite trees where the leaves are values of basic types, numeric types, etc.
    • Zero, Succ(Zero), Succ(Succ(Zero)), etc.

Algebraic Inductive Datatypes

  • Values of inductive datatypes can be compared using Dafny’s well-founded < ordering.
    • Example: Zero < Succ(Zero) < Succ(Succ(Zero))
  • Values of algebraic datatypes can be analysed using pattern matching:
    function add(n1: Nat, n2: Nat): Nat 
    {
        match n1
            case Zero    => n2
            case Succ(n) => Succ(add(n, n2))
    }
    

Lemmas and Calculations

  • We may express properties of our programs using lemmas. Lemmas are implicitly ghost methods (i.e. methods that are not compiled into the executable).
    lemma add_unit(n: Nat)
        ensures add(n, Zero) == n
        ensures add(Zero, n) == n
    {} 
    
    lemma add_succ(n1: Nat, n2: Nat)
        ensures Succ(add(n1,n2)) == add(n1, Succ(n2))
    {}
    
  • Note that if we don't include the {}, then it's an axiom (i.e., we assume it to be true).

Lemmas and Calculations

  • The previous lemmas are proved automatically by Dafny. However, there are cases where Dafny needs our assistance. For example:
    lemma add_commutative(n1: Nat, n2: Nat)
        ensures add(n1, n2) == add(n2, n1)
    {}
    

Induction over Natural Numbers

  • Recall the principle of induction over natural numbers: If we want to show that $P$ holds for all numbers $n$, we:
    1. show that $P(0)$ holds;
    2. show that, for any $n'$, if $P(n')$ holds, then so does $P(Succ(n'))$;

Lemmas and Calculations

  • We prove the lemma using induction on the first argument:
lemma {:induction n1} add_commutative(n1: Nat, n2: Nat)
    ensures add(n1, n2) == add(n2, n1)
{
    match n1
        case Zero =>
            calc {
                add(Zero, n2);
                == // def. add
                n2;
                == { add_unit(n2); }
                add(n2, Zero);
            }
        case Succ(n) => // Inductive hypothesis: add(n, n2) == add(n2, n)
}

Lemmas and Calculations

  • Inductive step:
case Succ(n) => // Inductive hypothesis: add(n, n2) == add(n2, n)
            calc {
                add(n1, n2);
                == // n1 == Succ(n)
                add(Succ(n), n2);
                == // def. add
                Succ(add(n,n2));
                == // IH: add(n, n2) == add(n2, n)
                Succ(add(n2,n));
                == { add_succ(n2, n); }
                add(n2, Succ(n));
                == // Succ(n) == n1
                add(n2, n1);
            }

Structural Induction

  • To prove that $P(x)$ holds for all values $x$ of a recursively-defined datatype $T$, we prove:
    1. $P(b)$ holds for each base case $b \in T$
    2. $P(c(x))$ for each constructor, $c$, assuming induction hypothesis $P(x)$

Structural Induction (Lists)

datatype List = Nil | Cons (head: T, tail: List)
  • To prove that $P(x)$ holds for all values $x$ of the type List, we prove:
    1. $P(Nil)$
    2. $P(Cons(h,x))$ assuming induction hypothesis $P(x)$ (and for arbitrary $h$)

Structural Induction (Lists)

Example (from this week's lab):
lemma reverse_append<T>(l1: List<T>, l2: List<T>)
  ensures reverse(append(l1, l2)) == 
          append(reverse(l2), reverse(l1))
  1. First, prove the property for the base case Nil          
    reverse(append(Nil, l2)) == append(reverse(l2), reverse(Nil))
  2. Second, prove property for Cons(h,x)
    reverse(append(Cons(h,x), l2)) == 
    append(reverse(l2), reverse(Cons(h,x)))
  3. assuming that the property is valid for x
    reverse(append(x, l2)) == append(reverse(l2), reverse(x))

Exercise

Use Dafny to prove this lemma. (We'll do it in the lab.)

Lemmas

  • We have been writing lemmas about functional code and we have been writing calculations to provide manual proofs
  • But lemmas can be used with imperative code, too.
  • And our proofs can be written using annotated code!
  • To illustrate this, let's consider a very simple searching algorithm: finding zeros in a specific class of arrays.

Example: Searching for Zero

  • We will consider a somewhat contrived example: searching for zero in an array. What makes this problem interesting is that the array we are searching in has two special properties:
    1. all elements are non-negative, and
    2. each successive element decreases by at most one from the previous element
  • As before, our program should return an index of where zero occurs; a negative index if there are no zeros.

Exercise

Let's implement and verify this example in Dafny.
### After this session, you should now be able to: - Use arrays, quantifiers, and predicates in Dafny - Verify simple algorithms using Dafny - Define algebraic inductive datatypes - Verify code using inductive arguments - Write code to prove lemmas

Credits

  • These slides use the framework reveal.js and were produced using vim and Ubuntu.
  • Some slides and examples are based on material available online by Rustan Leino. Some material is taken from Dafny's tutorial.