14 June, 2005

The traditional definition of "algorithm" is that it is a formally defined
procedure for performing some calculation. If a procedure is formally
defined, then it must be written down in some formal language, and such a
language, in which procedures for performing calculations can be written,
must be a *programming language*.

But when we talk about particular "algorithms", like the **Sieve of
Eratosthene** or **Quicksort**, we do not think of them as particular
programs written in a particular programming language. Rather we think of
them as *ideas* about how to write a program.

This point is made on the Tunes website in the page describing the The Sieve of Eratosthene. It is stated that:

you can write infinitely many programs based on this idea

Nevertheless, the Tunes project expresses the ambitious goal of devising a
programming language (**HLL**) sufficiently abstract that it can be used
to express any algorithm once and for all in a form that captures the precise
idea of that algorithm.

Unfortunately the authors of that page have not yet got around to actually writing the Sieve of Eratosthenes in HLL, leaving us to wonder if the goal is achievable.

Whether or not an algorithm is something that can be formally written down
may seem to be a mere point of academic discussion. But it has very practical
consequences when we want to achieve **software re-use**. The ideal of
re-use is that every bit of program only ever gets written once, thus saving
the endless rewriting that seems to fill 90% of the working hours of the
world's software developers.

Various computer science technologies have been proposed as solutions to the problem of re-use, including subroutines, object orientation and generic templates. And if any of these technologies was the solution, then it would have necessarily provided some means of writing an algorithm in a form so general and complete that one would never have to write code for that same algorithm again – all future implementations recognisable as being implementations of the "same" algorithm would be derived in some way from that one authoritative and definitive item of source code representing the algorithm.

But maybe there is no such technology, no such programming language, and no such way to once and for all perfectly express any particular algorithm as a single definitive chunk of source code.

The concept of algorithm may be more psychological than we want to admit. I've already mentioned that an algorithm is an idea about writing a program, rather than the program itself, and ideas are very slippery and subjective concepts. The fact that an idea is an idea about something as formal and precise as a computer program doesn't mean that the idea itself is formal and precise.

Despite the pessimism that the above implies about re-use, I believe that
there is one aspect of program code that is highly re-usable, and it's an
aspect that you wouldn't readily think of, because *it is left out of the
source code of almost software that is written*. This aspect is the proof
of the program's correctness.

The Sieve of Eratosthenes page on Tunes describes the "Mathematical Idea" of the Sieve algorithm by giving a series of informal definitions and lemmas. Presumably these definitions and lemmas (and their proofs) could be formally expressed in a mathematical language defined in an interactive theorem prover (such as Isabelle/HOL or Coq).

My proposition (and one I have not yet demonstrated directly, but which I feel could be so demonstrated), is that the proofs of the correctness of the many variations of the Sieve (as listed on the Tunes page) would successfully re-use the proof of correctness of any basic form of the algorithm.

This suggests that the best way to encapsulate the content of an "algorithm" is not to try and define it by a particular implementation in a particular programming language, but rather to define it by the mathematical definitions and theorems which are used to prove the correctness of an implementation (i.e. any implementation).

Even if we switch from program code to theorem proof, there still won't be
any *definitive* way to define an algorithm, because there are different
theorem proofs and different mathematical languages. But what we can achieve,
is that once a theorem proof has been performed for the correctness of one
particular implementation of the algorithm, those particular proof steps
*will never have to be performed again* (even though relevant additional
proof steps may be necessary to extend the proof to include variations on the
algorithm), and that will indeed be a limited but perfect form of re-use.