a blog about things that I've been thinking hard about

What is an Algorithm?

14 June, 2005
an algorithm is a concept with a proof of correctness

Two algorithms can be "the same", even though they have different steps.

The are the same because they express the same "idea" of how and why they solve some problem.

This "idea" can be formalised as a mathematical proof of the algorithm's correctness.

Algorithm = Calculation Procedure

The traditional definition of "algorithm" is that it is a formally defined procedure for performing some calculation. If a procedure is formally defined, then it must be written down in some formal language, and such a language, in which procedures for performing calculations can be written, must be a programming language.

But when we talk about particular "algorithms", like the Sieve of Eratosthene or Quicksort, we do not think of them as particular programs written in a particular programming language. Rather we think of them as ideas about how to write a program.

This point is made on the Tunes website in the page describing the The Sieve of Eratosthene. It is stated that:

you can write infinitely many programs based on this idea

Nevertheless, the Tunes project expresses the ambitious goal of devising a programming language (HLL) sufficiently abstract that it can be used to express any algorithm once and for all in a form that captures the precise idea of that algorithm.

Unfortunately the authors of that page have not yet got around to actually writing the Sieve of Eratosthenes in HLL, leaving us to wonder if the goal is achievable.

Re-use

Whether or not an algorithm is something that can be formally written down may seem to be a mere point of academic discussion. But it has very practical consequences when we want to achieve software re-use. The ideal of re-use is that every bit of program only ever gets written once, thus saving the endless rewriting that seems to fill 90% of the working hours of the world's software developers.

Various computer science technologies have been proposed as solutions to the problem of re-use, including subroutines, object orientation and generic templates. And if any of these technologies was the solution, then it would have necessarily provided some means of writing an algorithm in a form so general and complete that one would never have to write code for that same algorithm again – all future implementations recognisable as being implementations of the "same" algorithm would be derived in some way from that one authoritative and definitive item of source code representing the algorithm.

What is truly re-useable?

But maybe there is no such technology, no such programming language, and no such way to once and for all perfectly express any particular algorithm as a single definitive chunk of source code.

The concept of algorithm may be more psychological than we want to admit. I've already mentioned that an algorithm is an idea about writing a program, rather than the program itself, and ideas are very slippery and subjective concepts. The fact that an idea is an idea about something as formal and precise as a computer program doesn't mean that the idea itself is formal and precise.

Mathematical proofs are reusable

Despite the pessimism that the above implies about re-use, I believe that there is one aspect of program code that is highly re-usable, and it's an aspect that you wouldn't readily think of, because it is left out of the source code of almost software that is written. This aspect is the proof of the program's correctness.

The Sieve of Eratosthenes page on Tunes describes the "Mathematical Idea" of the Sieve algorithm by giving a series of informal definitions and lemmas. Presumably these definitions and lemmas (and their proofs) could be formally expressed in a mathematical language defined in an interactive theorem prover (such as Isabelle/HOL or Coq).

My proposition (and one I have not yet demonstrated directly, but which I feel could be so demonstrated), is that the proofs of the correctness of the many variations of the Sieve (as listed on the Tunes page) would successfully re-use the proof of correctness of any basic form of the algorithm.

This suggests that the best way to encapsulate the content of an "algorithm" is not to try and define it by a particular implementation in a particular programming language, but rather to define it by the mathematical definitions and theorems which are used to prove the correctness of an implementation (i.e. any implementation).

Even if we switch from program code to theorem proof, there still won't be any definitive way to define an algorithm, because there are different theorem proofs and different mathematical languages. But what we can achieve, is that once a theorem proof has been performed for the correctness of one particular implementation of the algorithm, those particular proof steps will never have to be performed again (even though relevant additional proof steps may be necessary to extend the proof to include variations on the algorithm), and that will indeed be a limited but perfect form of re-use.

Vote for or comment on this article on Reddit or Hacker News ...