95Mar26

No Silver Bullet II

Ok, this still isn't a proof, but I think it is significantly tighter.

Let's take the "Software Crisis" to be a programming productivity problem, and the Silver Bullet to be the notion that we can bump programming productivity up by one to two orders of magnitude by increasing the re-use of software components.

I'll start by observing that any such Silver Bullet must be based on the combination of large numbers of small components: a gain of one to two orders of magnitude means that the bulk of each new application must be assembled from pre-existing parts, and parts large enough to make up most of an application by themselves are too specialized to be re-used across many different applications, so the parts must be small and numerous.

So if we're talking about a Silver Bullet that will kill the Software Crisis, we're talking about at least thousands, probably millions, of components, freely recombined by the thousand in novel ways in each new application.

Now let us add the observation that the Software Crisis is invariably taken to be one not just of quantity, but of quality: Delivering millions of lines of code does not seem to pose a problem per se, but rather the problem lies in delivering millions of lines of code that work. Thus a Silver Bullet will not only have to give us lots of components and ways to combine them, but it must give us ways to guarantee that they work.

To date, the only proposed Silver Bullet solution to this problem that I am aware of consists of documenting the interfaces of the components, and then later verifying that all mated components in the completed structure satisfy all documented interface requirements. The only variability seems to lie in the extent to which this documentation and verification process is mechanized: Some proposals envision doing it all by hand, some entirely mechanically, most some mixture of the two. I don't believe the mixture chosen affects the argument at hand, so we may ignore this.

Remembering again that the Silver Bullet is designed to kill the Software Crisis, and that the Software Crisis revolves around software to run airport luggage systems, cars, power plants and such, I think it is clear that the specification language used will have to be powerful enough to describe much of the real world, probably including differential equations and such.

If we are to derive consistency with the specification from the interface descriptions of the software components together with information on their connectivity, then the language used to document the interfaces must necessarily be approximately as powerful as the specification language, and in practice is likely to be identical to it.

Note, however, that a problem as simple as SAT, the problem of determining whether a general set of two- and three-variable boolean equations is consistent, is known to be NP-Complete, and hence will almost certainly take exponential time to check in practice, whether done by hand or mechanically.
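
To make the exponential concrete, here is a minimal brute-force consistency check (a Python sketch of my own; the clause encoding and the function name are illustrative, not anything standard): it simply tries every assignment, and so in the worst case does work proportional to 2^n for n variables. Cleverer SAT procedures exist, but none is known to escape the exponential worst case.

    from itertools import product

    def satisfiable(clauses, n_vars):
        """Brute-force SAT check: try every assignment of n_vars booleans.

        Each clause is a list of literals: +i means variable i must be true,
        -i means variable i must be false (variables numbered from 1).
        Worst case examines all 2**n_vars assignments -- exponential time.
        """
        for assignment in product([False, True], repeat=n_vars):
            def holds(lit):
                value = assignment[abs(lit) - 1]
                return value if lit > 0 else not value
            if all(any(holds(lit) for lit in clause) for clause in clauses):
                return True       # found a consistent assignment
        return False              # every one of the 2**n_vars assignments fails

    # (x1 or x2 or not x3) and (not x1 or x3): consistent, e.g. x1 = x3 = True
    print(satisfiable([[1, 2, -3], [-1, 3]], 3))   # True
    # (x1) and (not x1): inconsistent
    print(satisfiable([[1], [-1]], 1))             # False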

Thus, verification of the correctness of any large system built of software components must either

  1. Be done in a language too weak to express problems like SAT, so as to guarantee polynomial (and in practice probably O(n log n) or better) time to verify the entire system, or

  2. Take time exponential in the number of components.

I assert that neither will in fact give us a Silver Bullet. If the language is simple enough to permit verification in essentially linear time, then it will be too simple to describe the systems we need to build. On the other hand, if verification costs rise exponentially with the size of the system, then we have failed in our goal of making large software systems as cheap and easy to construct as small ones (to within a linear scale-up factor), and whatever incidental gains we might have introduced will be swamped in the end by the exponential verification ramp.
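
A back-of-the-envelope Python sketch of that swamping effect (the cost model and all the numbers here are my own illustrative assumptions, not anything from the argument above): give construction a 100x productivity boost, keep an exponential verification term, and watch the boost evaporate as the system grows.

    # Illustrative cost model only: construction cost linear in the number of
    # components n, verification cost growing like 2**(n / 10) (say, one
    # "hard" interaction per ten components -- an arbitrary choice).
    def total_cost(n, build_cost_per_component):
        return n * build_cost_per_component + 2 ** (n / 10)

    for n in (50, 100, 200, 400):
        before = total_cost(n, build_cost_per_component=100.0)
        after = total_cost(n, build_cost_per_component=1.0)   # the 100x Silver Bullet
        print("n=%4d  before=%12.0f  after=%12.0f  ratio=%.3f"
              % (n, before, after, after / before))
    # At n=50 the Bullet cuts total cost roughly 60x; by n=200 the exponential
    # verification term dominates and the saving has all but vanished.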

Note that the primary complaint about large systems is that testing takes forever on them. Testing being the contemporary verification technology in practice, a ramp-up in testing costs that is exponential in system size is exactly what we would expect given the above argument: The existence of the Software Crisis can be taken as supporting evidence.

Note also that the argument does not appear to depend on whether the components in question were acquired off-the-shelf or written specifically for the project, nor on any particular low-level details of how they interface: It appears to apply to virtually all current and proposed software technologies from Fortran through Haskell, OOP through advanced CASE, and including systems coded (say) in straightforward C. If this is correct, and the argument sound, we can anticipate that none of these are going to make construction of large, reliable software systems significantly cheaper: An exponential ramp quickly swamps any linear changes.


Addendum

It has been frequently noted that Nature, and particularly the brain, tends to work by "heuristic" rather than "algorithmic" techniques, coping with errors and inconsistencies rather than eliminating them.

This is usually taken as evidence of the limitations of dumb evolution in finding good designs, and of an opportunity for Homo sapiens sapiens -- Wise Wise Man -- to do Mom Nature one better.

If the above argument is correct, however, it may be that there is in fact no alternative to heuristic solutions when confronting large programming problems. The brain is working with structures on the order of something like 10^14 interconnected components, while we remain stuck playing with toy systems of ~ 10^6 transistors or lines of code.

It may well be that the exponential ramp in verification expense inherently limits algorithmic solutions to toy sizes in this range, and that heuristic designs are in fact the only practical way of building large software systems.

If so, verification complexity (which I've been focussing on for convenience, even though I expect similar arguments will in fact prove to hold for other parts of the programming process) may be the dominant design constraint on the construction of large software systems, and good practical performance in the face of inevitable design faults may be the key design goal. Gries & Co may have the right argument by the wrong end when they argue that we cannot understand how to write large programs until we understand how to write small parts of them: It may be that we cannot understand how to write small parts of large programs until we understand the design constraints and decisions of the larger system.

Bertrand Meyer has been known to argue strongly, in connection with his Eiffel programming language, that modules should not check the correctness of their input data, but should rather depend on the interface contract between them and their caller, and assume it is honored. It may be that at the large scale this attitude is in fact demonstrably incorrect; that modules must be prepared to deal with periodic incorrect inputs; that the overall software system must continue to function sensibly, safely and usefully in the face of a continual low-level background of such faults; that the key enabling software technology for large software systems consists precisely of learning how to build in such fault tolerance; and that the Gries/Dijkstra/Hoare/... software-correctness technologies are at best specialized tools for optimizing small components of a software system and at worst just a plain irrelevant dead end ... and that "the hackers, the stupid hackers, will again win in the end" *giggle*.

[A riff on Larry Niven's The Magic Goes Away, substituting theoreticians for wizards and hackers for swordsmen.]
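
A toy Python contrast of the two attitudes (names and details invented for illustration; this is not Eiffel, and not anyone's actual design): one routine assumes its contract, the other expects a low-level background of bad inputs and keeps the larger system going.

    def average_contract(readings):
        # Meyer-style: the contract says the caller supplies a non-empty list
        # of numbers; the routine simply assumes it (the assert merely
        # documents the contract, it is not defensive input checking).
        assert readings, "contract violated: readings must be non-empty"
        return sum(readings) / len(readings)

    def average_tolerant(readings, fallback=0.0, log=print):
        # Fault-tolerant style: expect occasional bad inputs, report them,
        # and let the overall system carry on sensibly.
        good = [r for r in readings if isinstance(r, (int, float))]
        if not good:
            log("average_tolerant: no usable readings, returning fallback")
            return fallback
        if len(good) < len(readings):
            log("average_tolerant: discarded %d bad readings"
                % (len(readings) - len(good)))
        return sum(good) / len(good)

    print(average_tolerant([1.0, "glitch", 3.0]))   # warns, returns 2.0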

The CommonLisp condition system is the most sophisticated technology I've yet seen for signalling software faults, publishing available resources for dealing with them, and mediating human and software efforts to resolve the fault. It may be that this condition system has a lot more to tell us about the practical construction of large software systems than all the software correctness people put together. (Which might fill a large phonebooth? *grin*)
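
For what it's worth, here is a very rough Python analog of that idea (entirely my own toy code, and a pale shadow of the real thing: genuine CommonLisp restarts let the handler resume at the point of the fault rather than unwinding the stack): the low-level code signals a fault and publishes the recovery options it can offer, and a higher-level policy, or a human, picks one.

    class BadRecord(Exception):
        """A signalled fault that publishes its available recovery options."""
        def __init__(self, record, restarts):
            super().__init__("bad record: %r" % (record,))
            self.record = record
            self.restarts = restarts     # name -> zero-argument recovery function

    def parse_record(line):
        try:
            return int(line)
        except ValueError:
            # Signal the fault and publish what we know how to do about it.
            raise BadRecord(line, restarts={
                "skip":        lambda: None,
                "use-default": lambda: 0,
            })

    def load(lines, policy="skip"):
        out = []
        for line in lines:
            try:
                value = parse_record(line)
            except BadRecord as fault:
                value = fault.restarts[policy]()   # the handler chooses a restart
            if value is not None:
                out.append(value)
        return out

    print(load(["1", "oops", "3"]))                        # [1, 3]
    print(load(["1", "oops", "3"], policy="use-default"))  # [1, 0, 3]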