Referential Integrity: A Problem Solved

Author: Martyn Cutcher

Email: martyn@cutthecrap.biz

Website: www.cutthecrap.biz

What's The Problem?

Referential Integrity is a common problem in computing. Phrased slightly differently, it is the problem of maintaining "the integrity of references".

It is most often considered with regard to databases, and specifically relational databases.

Consider two tables:

Table One

ID   Data        Reference
12   Some data   34
13   More data   43

Table Two

ID   Data
34   even more data
35   unreachable data

The first point here is a pretty deep one: at the simplest level there is NO problem. The database stores the data, and that is it.

Unbeknownst to the database, however, an "interpretation" of the data requires that each value in the Reference column of Table One match a value in the ID column of Table Two.

Unfortunately, although the first row in Table One seems okay, the second row references ID "43" which does not exist. This is a problem of Referential Integrity.
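The situation above can be reproduced in a few lines. This is only a sketch, assuming Python's built-in sqlite3 module and the hypothetical table names table_one and table_two; no constraints are declared, so the database accepts the bad row without complaint and the dangling reference has to be hunted down by a query.

```python
import sqlite3

# In-memory database holding the two example tables, with no constraints declared.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_two (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table_one (id INTEGER PRIMARY KEY, data TEXT, reference INTEGER);
    INSERT INTO table_two VALUES (34, 'even more data'), (35, 'unreachable data');
    INSERT INTO table_one VALUES (12, 'Some data', 34), (13, 'More data', 43);
""")

# Rows in table_one whose reference matches no id in table_two.
dangling = conn.execute("""
    SELECT t1.id, t1.reference
    FROM table_one AS t1
    LEFT JOIN table_two AS t2 ON t2.id = t1.reference
    WHERE t2.id IS NULL
""").fetchall()

# Rows in table_two that nothing in table_one refers to.
unreachable = conn.execute("""
    SELECT t2.id
    FROM table_two AS t2
    LEFT JOIN table_one AS t1 ON t1.reference = t2.id
    WHERE t1.id IS NULL
""").fetchall()

print(dangling)     # row 13 points at the missing ID 43
print(unreachable)  # row 35 is never referenced
```

The database stored everything it was given; the "problem" only exists in the interpretation expressed by the two queries.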

The Database "Solution"

To maintain referential integrity you can define "integrity constraints" against a table. These are checked when data is modified or, where the database supports deferred constraints, when the transaction is committed. You may also define triggers that perform additional operations.
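As an illustration of such a constraint, here is a minimal sketch, again assuming Python's sqlite3 module and the hypothetical table names table_one and table_two. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on for the connection; other databases behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table_two (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE table_one (
        id INTEGER PRIMARY KEY,
        data TEXT,
        reference INTEGER REFERENCES table_two(id)  -- the integrity constraint
    );
    INSERT INTO table_two VALUES (34, 'even more data');
    INSERT INTO table_one VALUES (12, 'Some data', 34);
""")

# The second row from the example references ID 43, which does not exist.
try:
    conn.execute("INSERT INTO table_one VALUES (13, 'More data', 43)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The constraint has to be declared by hand for each referencing column; the database still has no general notion that one value "is a reference" to another.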

In my opinion, these are not solutions to the problem of referential integrity, but simply techniques that can be used to attempt to manage such problems.

As data structures become more complex, the task of ensuring their integrity becomes more complex at least as quickly.

The real problem is that the database does not know (or care) that column values in one table should match column values in another.

This is complicated further by the lack of restrictions on database schemas. You are pretty much free to define whatever model you like and, more relevantly, any kind of referencing mechanism.

The Generic Persistent Object Solution

In a GPO model, objects reference each other directly. Of course, there is no magic, and somewhere this is stored as a data reference. But here is the point: the GPO system "knows" that it is an object reference - and really does "care" about it!

The internal GPO structures allow an object to track references to itself, so that if it is removed, for example, it is able to clear any references.
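The idea of an object tracking references to itself can be sketched as follows. This is not the GPO API: the class PersistentObject and its methods are hypothetical names invented for illustration, and persistence is left out entirely. Each object records who points at it, so removing it can clear every incoming reference.

```python
class PersistentObject:
    """Hypothetical stand-in for a GPO-style object; not the real GPO API."""

    def __init__(self, name):
        self.name = name
        self._links = {}          # property name -> referenced object
        self._referrers = set()   # (referring object, property name) pairs

    def set_link(self, prop, target):
        """Point this object's property at target and register the back-reference."""
        self.clear_link(prop)
        self._links[prop] = target
        target._referrers.add((self, prop))

    def clear_link(self, prop):
        """Drop the property and unregister the back-reference, if any."""
        target = self._links.pop(prop, None)
        if target is not None:
            target._referrers.discard((self, prop))

    def get_link(self, prop):
        return self._links.get(prop)

    def referrers(self, prop):
        """All objects whose given property currently points at this object."""
        return [source for source, p in self._referrers if p == prop]

    def remove(self):
        """Clear every reference to and from this object."""
        for source, prop in list(self._referrers):
            source.clear_link(prop)
        for prop in list(self._links):
            self.clear_link(prop)


row12 = PersistentObject("row 12")
row13 = PersistentObject("row 13")
target = PersistentObject("row 34")
row12.set_link("reference", target)
row13.set_link("reference", target)

# The back-references double as a lookup of everything that points at target.
names_before = sorted(obj.name for obj in target.referrers("reference"))

target.remove()
print(names_before, row12.get_link("reference"), row13.get_link("reference"))
```

In this sketch a reference can only be created through set_link, so there is no way to end up pointing at an object that was removed.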

In GPO it is simply not possible for there to be a problem of "referential integrity".

The Double Win

The genuinely exciting thing about GPO - well it excites me anyhow - is that the structures that support the maintenance of "referential integrity" are the same structures that support the modelling of object associations, and the same structures that effectively "index" object sets.

When things come together so sweetly, you know you have solved the problem.

What is this "GPO" then?

If you are really interested, the best thing is to check out the main Cut The Crap website, where you can find more information, download the software, and follow a tutorial introduction.

If you have any comments/questions on this paper or the technology, please email the author.