When a system is designed and implemented there is a range of low-level concepts that the designer starts with. These are the design paradigms and programming idioms that have been developed over the last fifty years. This article briefly describes mainstream idioms and paradigms with the objective of demystifying the "received wisdom" that screws up so many systems.
In the beginning there was the instruction. And the instruction was good, and the processor saw that it was good.
....and that was that!
But the instructions that move data from memory to register, or register to register, or register to memory, or even the one that adds one register to another, were not sufficient.
Other instructions were needed: instructions that allow comparison between two registers, setting a special status code register, and instructions that set a special instruction pointer register dependent on those status codes.
Each processor defined a new set of instructions - an instruction set - that it promised to understand.
Machine code is the numeric encoding of the machine instructions. This is the raw data that a computer's central processing unit (CPU) "understands" (or rather "is designed to process"). Originally people would write programs directly in machine code for loading into a computer.
Machine code does have one special property - everything is data. The programmer knows that they simply produce data.
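To make "everything is data" concrete, here is a minimal sketch of a toy machine. The opcode numbers, the three-number instruction layout and the four registers are all assumptions invented for illustration; they do not correspond to any real instruction set. The program is nothing more than a list of integers, and the compare and conditional-jump instructions work through a status register and an instruction pointer as described earlier.

```python
# Hypothetical opcodes for a toy machine; not any real instruction set.
HALT, LOADI, ADD, CMP, JLT = 0, 1, 2, 3, 4

def run(memory):
    """Execute the numbers in `memory` as instructions; return the registers."""
    regs = [0, 0, 0, 0]
    status = 0   # status code register, set by CMP
    ip = 0       # instruction pointer register
    while True:
        op, a, b = memory[ip], memory[ip + 1], memory[ip + 2]
        ip += 3
        if op == HALT:
            return regs
        elif op == LOADI:    # regs[a] = constant b
            regs[a] = b
        elif op == ADD:      # regs[a] = regs[a] + regs[b]
            regs[a] += regs[b]
        elif op == CMP:      # status becomes -1, 0 or 1
            status = (regs[a] > regs[b]) - (regs[a] < regs[b])
        elif op == JLT:      # jump to address a if the last CMP said "less than"
            if status < 0:
                ip = a
        else:
            raise ValueError(f"unknown opcode {op}")

# Sum 1..5: r0 = total, r1 = counter, r2 = step (1), r3 = limit (6)
program = [
    LOADI, 0, 0,   # address 0
    LOADI, 1, 1,   # address 3
    LOADI, 2, 1,   # address 6
    LOADI, 3, 6,   # address 9
    ADD, 0, 1,     # address 12: total += counter
    ADD, 1, 2,     # address 15: counter += step
    CMP, 1, 3,     # address 18: compare counter with limit
    JLT, 12, 0,    # address 21: loop back while counter < limit
    HALT, 0, 0,    # address 24
]

registers = run(program)
```

Because the program is just a list of numbers, another program could build or modify it before it is run, which is exactly the awareness that later layers tend to hide.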
The mnemonic assembler is a much more accessible way to write machine code.
Rather than learning the numeric patterns that must be generated, the programmer writes mnemonics in an assembler file; an assembler program then processes that file and generates the machine code. There is a one-to-one mapping from assembler command to machine instruction.
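The one-to-one mapping can be sketched in a few lines. The mnemonics, opcode numbers and the fixed "opcode plus two operands" layout below are hypothetical, chosen only to keep the example small; a real assembler also handles labels, addressing modes and much more.

```python
# Hypothetical mnemonics and opcode numbers for a toy machine.
OPCODES = {"HALT": 0, "LOADI": 1, "ADD": 2, "CMP": 3, "JLT": 4}

def assemble(source):
    """Translate each 'MNEMONIC a b' line into exactly three numbers."""
    code = []
    for line in source.strip().splitlines():
        line = line.split(";")[0].strip()       # drop ';' comments
        if not line:
            continue
        mnemonic, *operands = line.split()
        operands = [int(x) for x in operands]
        operands += [0] * (2 - len(operands))   # pad missing operands with 0
        code += [OPCODES[mnemonic], *operands]
    return code

machine_code = assemble("""
    LOADI 0 7   ; r0 = 7
    ADD 0 1     ; r0 = r0 + r1
    HALT
""")
```

Each source line yields exactly one encoded instruction; the assembler adds convenience, not new capability.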
At this early stage, the programmer has already lost the direct awareness that the program is just data - since they are not generating that data themselves.
The macro assembler is a variant on the standard assembler, where templates (macros) can be defined to generate well-specified patterns of machine code.
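Template expansion of this kind might look like the following sketch. The macro name `INC`, its parameters and the mnemonics it expands to are all made up for the example; the point is only that one line of source becomes a fixed pattern of several assembler lines.

```python
# Hypothetical macro: add a constant to a register, using register 3 as scratch.
MACROS = {
    "INC": (["reg", "amount"], [
        "LOADI 3 {amount}",
        "ADD {reg} 3",
    ]),
}

def expand(source):
    """Replace each macro call with its template; pass other lines through."""
    out = []
    for line in source.strip().splitlines():
        if not line.strip():
            continue
        name, *args = line.split()
        if name in MACROS:
            params, template = MACROS[name]
            bindings = dict(zip(params, args))
            out += [t.format(**bindings) for t in template]
        else:
            out.append(line.strip())
    return out

expanded = expand("""
    LOADI 0 5
    INC 0 2
    HALT
""")
```

Here the mapping is no longer one-to-one: a single `INC` line stands for two instructions, a first small step towards abstraction.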
After macro assemblers, problem-centred programming languages like FORTRAN and COBOL (of course in UPPERCASE) were followed by more general purpose languages such as ALGOL, PASCAL and C.
At the same time another school of computer scientists was developing the first symbolic programming language - LISP - but that is another tale.
So what fuelled the development of computer languages?
Surely a mnemonic assembler should be sufficient to effectively communicate with the computer?
The inspiration for computer language development was the discovery and use of programming idioms.
An idiom is a pattern that provides some useful functionality. This is the initial realisation that leads to abstractions.
Since computer programs primarily examine and manipulate data structures, one of the simplest and most useful structures is an array, or list of data.
All programming languages provide support for looping, or iterating, constructs that can be conveniently used to process a list of data elements. Hence:
for i from 0 to n...
for (int i = 0; i < n; i++) {...
while (i < n) {...
do {...} while(...)
do {...} until(...)
These kinds of constructs should be familiar to any programmer. The idiom is the same; only the particular syntax varies between languages.
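As a small illustration that the idiom is independent of the syntax, here is the same list traversal written twice in Python, once with a counted `for` and once with a `while`. The function names and sample data are invented for the example.

```python
data = [3, 1, 4, 1, 5]

def total_for(items):
    """Counted loop: 'for i from 0 to n'."""
    total = 0
    for i in range(len(items)):
        total += items[i]
    return total

def total_while(items):
    """Condition-tested loop: 'while (i < n)'."""
    total, i = 0, 0
    while i < len(items):
        total += items[i]
        i += 1
    return total
```

Both functions visit each element once and accumulate a result; which spelling is used is a matter of language and taste.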
As computer programs began to grow in size, so ideas of how to structure larger programs were developed.
Programmers quickly discovered that it was useful to be able to arrange code in blocks or subroutines. Conventions were needed to allow the program flow to move from one block to another and back again.
Along with the idea of the subroutine came the requirement for passing data from the calling routine to the called routine. Initially this would be done via global data areas that would be set by the caller, or perhaps by setting some predefined registers. Early programming languages provided idiomatic support for this kind of parameter passing.
As experience of subroutines and parameter passing grew, so better idioms were developed to provide more general support. Concepts such as program stacks and application heaps were introduced.
These new concepts allowed idiomatic call conventions to pass data in a more generic way, and specifically allowed for recursive calls using a special register as a stack pointer that grew and shrank the stack as the routines were entered and exited. Some languages used the stack to also store local variables for a routine, whilst others used a different area - the application heap.
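The role of the stack in recursion can be sketched by managing the frames by hand. In the hypothetical example below, `factorial` pushes one frame per pending "call" and pops one per "return"; a Python list stands in for the stack pointer and stack memory, and `factorial_recursive` is the same computation with the language doing that bookkeeping.

```python
def factorial(n):
    stack = []                 # the program stack, one frame per pending call
    # Entering routines: each "call" pushes a frame holding its own copy of n.
    while n > 1:
        stack.append({"n": n})
        n -= 1
    result = 1                 # the innermost call returns 1
    # Exiting routines: each "return" pops a frame and uses its saved n.
    while stack:
        frame = stack.pop()
        result *= frame["n"]
    return result

def factorial_recursive(n):
    """The same computation, with the language managing the stack."""
    return 1 if n <= 1 else n * factorial_recursive(n - 1)
```

With a single global parameter area each nested call would overwrite the caller's `n`; giving every invocation its own frame is what makes recursion work.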
Note that the introduction of idioms to conveniently support a single return value made it awkward for functions to provide multiple return values - a concept that system designers were now blind to.
A technique developed early on, the jump table, was to store routine, or code block, addresses within some data structure. Some computation could then result in accessing the data structure and transferring control to the routine indicated.
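A minimal sketch of this technique, using Python function references in place of raw code addresses. The routine names and the use of a small integer as the table index are assumptions made for the example.

```python
def op_add(a, b):
    return a + b

def op_sub(a, b):
    return a - b

def op_mul(a, b):
    return a * b

# The data structure holding the routine "addresses".
JUMP_TABLE = [op_add, op_sub, op_mul]

def dispatch(index, a, b):
    """Use a computed index to pick a routine, then transfer control to it."""
    routine = JUMP_TABLE[index]
    return routine(a, b)
```

For instance, `dispatch(2, 3, 4)` looks up the third entry and calls `op_mul(3, 4)`. The caller never names the routine directly; the data decides where control goes.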
What do people have in their heads when they develop an idiom? They are trying to achieve some computational pattern, sure, but why are they trying to do this?
The rationalisation of why and how to use different idioms leads to the development of programming paradigms. Such paradigms may then be further encouraged by specific programming constructs in some languages. The key to a successful paradigm is its generality and scalability. A paradigm has to be something that can be applied pervasively, at all levels, otherwise it is simply a technique.
The procedural programming paradigm is really the only paradigm that is understood by most programmers today. And even then I'm not sure how well.
Procedural programming tends to be associated with the design ideas of modularisation and functional decomposition. The objective of procedural programming is clarity and simplicity.
The goal of modularisation indicates that the program should be structured in such a way that a modular procedure can be easily re-used by other procedures.
The goal of functional decomposition should mean that long complex procedures can be split - where appropriate - into several simpler procedures/functions.
It should be clear that the twin goals of modularisation and functional decomposition are symbiotic.
Rule of thumb guidelines like "no procedure should be longer than twenty lines" miss the point entirely. The objective is clarity. By all means ask the questions "Could this procedure be written better?" and "Is there a good case for introducing a sub-routine?" - but always understand what benefit is achieved first.
Functional decomposition can also be an excuse for a condition I have called "code fright". It is, I believe, a very real problem, where a programmer can never bring themselves to solve a possibly complex problem, so they keep on introducing more and more sub-routines, trying to put off the time when they will have to write the real code.
The Object-Oriented or OO paradigm is probably the one most worked on and analysed over the past thirty years - that's right, it was not developed by Microsoft in 1993!
The essential idea is that functions should be associated with data. And that this data should - must - have identity.
This recognises that when a procedure is called there is always an objective to modify some data - or create some other side-effect - otherwise there is no point to the procedure.
It is therefore possible to do OO programming with a procedural programming language: simply write each procedure to take, as its first argument, the address of some data structure. Only elements in this data structure should be modified by the procedure, whilst other parameters can be used to parameterize the behaviour.
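A sketch of that convention, written in deliberately procedural Python with no classes. The "account" record and its procedures are hypothetical; what matters is that every procedure takes the data structure as its first argument and modifies nothing else.

```python
def account_new(owner):
    """Create the data structure; each call yields a distinct object."""
    return {"owner": owner, "balance": 0}

def account_deposit(account, amount):
    # Only fields of the first argument are modified.
    account["balance"] += amount

def account_withdraw(account, amount):
    if amount > account["balance"]:
        raise ValueError("insufficient funds")
    account["balance"] -= amount

a = account_new("alice")
b = account_new("alice")
account_deposit(a, 100)
account_withdraw(a, 30)
```

After this runs, `a` holds a balance of 70 while `b` is untouched, even though both were created with the same owner: the data has identity, not just value.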
When you start to organise a procedural program along these lines, you soon discover that procedures can be grouped according to the type of data structure they are intended to apply to - or you will if you correctly follow the objectives of functional and modular decomposition!
A language such as C++ simply takes this empirical structure and provides some syntactic and semantic support.
Ideas of polymorphism, class inheritance and so on developed from associating this simple data-centric view of a computer program with the jump table idiom. Here, any "object" is a data structure that has as an element a reference to some other data structure that serves as a function jump table.
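The combination can be sketched without any class syntax at all. In this hypothetical example every object is a dictionary whose `"vtable"` element refers to a table of functions; "inheritance" is a copy of the parent's table with some entries replaced. The names `send`, `new_object` and the animal tables are invented for illustration and do not reflect how any particular language lays out its objects.

```python
def send(obj, message, *args):
    """Look the routine up in the object's jump table and call it."""
    return obj["vtable"][message](obj, *args)

def animal_speak(self):
    return "..."

def animal_describe(self):
    # Calls through the table, so a subclass can change the result.
    return f"{self['name']} says {send(self, 'speak')}"

ANIMAL_TABLE = {"speak": animal_speak, "describe": animal_describe}

def dog_speak(self):
    return "Woof"

# Inheritance: start from the parent's table, then override one entry.
DOG_TABLE = {**ANIMAL_TABLE, "speak": dog_speak}

def new_object(table, name):
    return {"vtable": table, "name": name}

rex = new_object(DOG_TABLE, "Rex")
thing = new_object(ANIMAL_TABLE, "Thing")
```

Calling `send(rex, "describe")` runs the inherited `animal_describe`, which in turn dispatches `speak` through Rex's own table and so reaches `dog_speak`. That indirection is all there is to polymorphism in this sketch.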
There are various modified idioms that provide different levels of indirection and slightly different behaviours in different languages, but there is no magik!
This is the bit that few programmers get.
The point is to use the OO idiomatic structures to help design a system with "good" abstractions. It is a very good approach to attempt to minimise blindness, and is very powerful. All too often tho' it is instead an obstacle. Supposedly OO designs simply provide a further barrier by encoding abstractions that are more to do with an underlying procedural design than the problem at hand.
The accepted view of OO is that it was developed at Xerox PARC in the 70's and 80's, primarily from the Smalltalk language. In fact, a parallel effort at MIT developed the ACT language, which has unfortunately not had the recognition it deserves.
The PARC team emphasized specialization and generalization as the OO design method.
This is an iterative and incremental approach. Extreme Programming a new thing? Don't make me laugh!
Many programmers believe they understand what these terms mean. I dispute that: I believe very few really know what is meant, since I have seen very little evidence of it being done.
Specialization is a pretty straightforward idea - you get some object with some behaviour and you create a new specialized object that refines the behaviour in some way.
Generalization tho' is something other than stating that a superclass is a generalization of one of its subclasses. It begins when, as a result of some specific specialization, the designer/programmer becomes aware that things are not quite right.
A simple generalization step may be the addition of some more general protocol to be handled by all subclasses, but it should often be much more than that.
The process of generalization also includes the recognition that the class hierarchy should be reorganised - that the existing structure is in some sense inappropriate. This is akin to discarding a scientific theory rather than simply refining it.
This is rarely done, since either a) programmers do not recognize that it is needed, or b) project managers cannot be persuaded that it means anything. But the lack of generalization is frequently the cause of failed or expensively overrun projects, and of the perception that OO doesn't yield any benefits.