The Generic Persistent Object Model

Author: Martyn Cutcher

Email: martyn@cutthecrap.biz

Website: www.cutthecrap.biz

The Generic Persistent Object Model (hereafter referred to as GPO) is the basis for most of the Cut The Crap software.

This whitepaper attempts to provide a clear description of the technology.

What is an Object?

The fundamental property, before we get into anything else, is identity.

Each separate object has its own unique identity. This has nothing to do with similarity. The real world is populated by an uncountable number of unique objects, from people to shoes to trees to leaves to molecules and atoms.

Each is unique. To model the real world, or indeed virtual elements of it, identity is key.

Object Creation

To create an object therefore, some environment is needed to "afford" it identity.

In the GPO system, a special object exists called the ObjectManager.

When a GPO system is initialized, the ObjectManager for that system is retrieved. The ObjectManager is GOD in GPO.

client = new OMClient();

om = client.getObjectManager();

When a new GPO object is created, a reference to the ObjectManager must be provided. The initialization of that object includes the provision of a unique identity by the ObjectManager.

gpm = new GPOMap(om);

The identity of a created object can be retrieved using the getID() method, and the basis of persistent access by the ObjectManager can be shown as:

gpm == om.getObject(gpm.getID());
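The relationship between identity and retrieval can be illustrated with a minimal in-memory sketch. The class names SketchObjectManager and SketchObject are hypothetical stand-ins, not GPO classes, and nothing here is persisted; the sketch only shows a manager handing out a unique identity at construction time and resolving that identity back to the same instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an object manager (not the GPO implementation):
// it hands out identities and resolves them back to the owning instance.
class SketchObjectManager {
    private final Map<Long, SketchObject> objects = new HashMap<>();
    private long nextId = 1;

    // Called from SketchObject's constructor to obtain a unique identity.
    long register(SketchObject obj) {
        long id = nextId++;
        objects.put(id, obj);
        return id;
    }

    // Returns the object that owns the given identity, or null if unknown.
    SketchObject getObject(long id) {
        return objects.get(id);
    }
}

// Hypothetical object whose identity is provided by the manager on creation.
class SketchObject {
    private final long id;

    SketchObject(SketchObjectManager om) {
        this.id = om.register(this);
    }

    long getID() {
        return id;
    }
}

class IdentitySketch {
    public static void main(String[] args) {
        SketchObjectManager om = new SketchObjectManager();
        SketchObject a = new SketchObject(om);
        SketchObject b = new SketchObject(om);
        System.out.println(a == om.getObject(a.getID())); // same instance comes back
        System.out.println(a.getID() != b.getID());       // identities differ
    }
}
```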

Object Properties

Creating new objects may provide a mild thrill for some, but it is unlikely to catch on as something useful in itself.

The objects created need to have some other data in addition to their identity. At the simplest level these are simple properties.

gpm.set("name", "My First Object");

The base GPOMap object allows standard Java objects to be set as values of properties identified by key strings.

System.out.println(gpm.get("name"));

will print the value of the "name" property of the gpm object.

Object Associations

A property becomes slightly more interesting when it refers to another persistent object:

gpm2 = new GPOMap(om);
gpm2.set("name", "Another Object");

gpm2.set("parent", gpm);

gpm2.get("parent").get("name");

The example above shows the creation of a second object and its association with the first object through a new "parent" property, followed by method chaining to demonstrate object navigation.

This is pretty much what we would expect from any system. However, when a reference is made from one object to another, a link structure is defined that allows reverse navigation:

ls = gpm.getLinkSet("parent");

The ls object is a LinkSet that contains all the objects that reference gpm using the "parent" property.

gpm3 = new GPOMap(om);
gpm3.set("parent", gpm);

ls.size(); // returns 2

The link structure maintained by GPO that enables this does not use any vector or array structures. Everything is managed with a number of doubly linked lists, so a one-to-many association of perhaps millions of objects can be managed with low resource overhead.
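The idea of keeping reverse links in doubly linked lists can be sketched as follows. This is an illustrative in-memory model under stated assumptions, not the GPO implementation: the names BackLink, BackLinkList and LinkedObject are hypothetical, and the sketch uses hash maps for property lookup purely for brevity. The point it shows is that adding or removing a referrer only touches neighbouring list nodes, so the cost does not grow with the size of the set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One node per reference, chained into the target's list (hypothetical names).
class BackLink {
    final LinkedObject source; // the object holding the reference
    BackLink prev, next;       // neighbours in the target's list
    BackLink(LinkedObject source) { this.source = source; }
}

class BackLinkList {
    BackLink head;
    int size;

    // Constant time: push the node at the head of the list.
    void add(BackLink n) {
        n.prev = null;
        n.next = head;
        if (head != null) head.prev = n;
        head = n;
        size++;
    }

    // Constant time: unlink the node by rewiring its two neighbours.
    void remove(BackLink n) {
        if (n.prev != null) n.prev.next = n.next; else head = n.next;
        if (n.next != null) n.next.prev = n.prev;
        n.prev = null;
        n.next = null;
        size--;
    }
}

class LinkedObject {
    private final Map<String, LinkedObject> refs = new HashMap<>();     // forward references
    private final Map<String, BackLink> outgoing = new HashMap<>();     // this object's node per property
    private final Map<String, BackLinkList> incoming = new HashMap<>(); // referrers grouped by property

    // Sets a reference and maintains the reverse link on the target.
    void set(String property, LinkedObject target) {
        BackLink old = outgoing.get(property);
        if (old != null) {
            // Re-pointing: detach from the previous target's list first.
            refs.get(property).incoming.get(property).remove(old);
        }
        BackLink link = new BackLink(this);
        target.incoming.computeIfAbsent(property, k -> new BackLinkList()).add(link);
        outgoing.put(property, link);
        refs.put(property, target);
    }

    LinkedObject get(String property) { return refs.get(property); }

    // Number of objects referencing this one through the given property.
    int linkCount(String property) {
        BackLinkList list = incoming.get(property);
        return list == null ? 0 : list.size;
    }

    // Walks the list; copied into a List here only for convenience in the demo.
    List<LinkedObject> linkSet(String property) {
        List<LinkedObject> out = new ArrayList<>();
        BackLinkList list = incoming.get(property);
        for (BackLink n = (list == null) ? null : list.head; n != null; n = n.next) {
            out.add(n.source);
        }
        return out;
    }
}

class ReverseLinkSketch {
    public static void main(String[] args) {
        LinkedObject gpm = new LinkedObject();
        LinkedObject gpm2 = new LinkedObject();
        LinkedObject gpm3 = new LinkedObject();
        gpm2.set("parent", gpm);
        gpm3.set("parent", gpm);
        System.out.println(gpm.linkCount("parent"));
    }
}
```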

Many-to-Many

To represent a many-to-many association it is "simply" necessary to introduce an intermediate object. This is analogous to a relational join table.

A utility object class, ManyManyLink, is provided to manage these links. A feature of GPO is an option to store one object along with another. Each ManyManyLink object dynamically chooses which of its two objects to be stored with, based on the relative number of ManyManyLinks currently stored with each. In this way the system is self-balancing, and the resulting performance of many-to-many navigation is of the same order as one-to-many.
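The balancing choice described above might look something like the following sketch. It rests on an assumption the text does not spell out: that a new link is stored with whichever end currently holds fewer links, with ties going to the first end. The names Holder and StoredLink are hypothetical and are not part of GPO.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical object that can have link records stored alongside it.
class Holder {
    final String name;
    final List<StoredLink> storedLinks = new ArrayList<>();
    Holder(String name) { this.name = name; }
}

// Hypothetical intermediate link that picks where it is stored.
class StoredLink {
    final Holder a;
    final Holder b;
    final Holder storedWith;

    StoredLink(Holder a, Holder b) {
        this.a = a;
        this.b = b;
        // Assumption: store with the end holding fewer links; ties go to 'a'.
        this.storedWith = (b.storedLinks.size() < a.storedLinks.size()) ? b : a;
        this.storedWith.storedLinks.add(this);
    }
}

class BalanceSketch {
    public static void main(String[] args) {
        Holder parent = new Holder("p1");
        for (int i = 1; i <= 4; i++) {
            Holder child = new Holder("c" + i);
            StoredLink link = new StoredLink(parent, child);
            System.out.println("link to " + child.name + " stored with " + link.storedWith.name);
        }
    }
}
```

Under this assumption, a parent with very many children does not accumulate all of the link records itself; most of them end up stored with the individual children.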

For example, suppose there are four "child" objects and each shares the same two "parents". This could be represented as:

p1 = new GPOMap(om);
p2 = new GPOMap(om);
c1 = new GPOMap(om);
c2 = new GPOMap(om);
c3 = new GPOMap(om);
c4 = new GPOMap(om);
new ManyManyLink("parent", p1, "child", c1);
new ManyManyLink("parent", p2, "child", c1);
new ManyManyLink("parent", p1, "child", c2);
new ManyManyLink("parent", p2, "child", c2);
new ManyManyLink("parent", p1, "child", c3);
new ManyManyLink("parent", p2, "child", c3);
new ManyManyLink("parent", p1, "child", c4);
new ManyManyLink("parent", p2, "child", c4);

A utility protocol is available on IGPOMap that may be preferred:

c1.addManyMany("child", "parent", p1);

Which is the same as:

p1.addManyMany("parent", "child", c1);

When iterating a set of objects, a ManyManyLink will automatically "resolve" the reference to the intended target, so that, for example:

children = p1.getLinkSet("parent").iterator();

will iterate over c1 to c4 and not the intermediate ManyManyLink objects.
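The "resolve" step can be pictured with a small sketch in which an intermediate link knows both of its ends and, when reached from one end, hands back the other. The names Item and JoinLink are hypothetical, the links are held in ordinary lists for brevity, and a link from an object to itself is not handled; none of this is claimed to match the ManyManyLink internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical object participating in many-to-many associations.
class Item {
    final String name;
    final List<JoinLink> links = new ArrayList<>();
    Item(String name) { this.name = name; }

    // Visits the links in which this item plays the given role and
    // resolves each one to the object at the opposite end.
    List<Item> linked(String myRole) {
        List<Item> out = new ArrayList<>();
        for (JoinLink link : links) {
            if (link.roleOf(this).equals(myRole)) {
                out.add(link.other(this));
            }
        }
        return out;
    }
}

// Hypothetical intermediate object, analogous to a row in a join table.
class JoinLink {
    final String roleA;
    final Item a;
    final String roleB;
    final Item b;

    JoinLink(String roleA, Item a, String roleB, Item b) {
        this.roleA = roleA;
        this.a = a;
        this.roleB = roleB;
        this.b = b;
        a.links.add(this);
        b.links.add(this);
    }

    String roleOf(Item end) { return end == a ? roleA : roleB; }

    Item other(Item end) { return end == a ? b : a; }
}

class ResolveSketch {
    public static void main(String[] args) {
        Item p1 = new Item("p1");
        Item p2 = new Item("p2");
        Item c1 = new Item("c1");
        Item c2 = new Item("c2");
        new JoinLink("parent", p1, "child", c1);
        new JoinLink("parent", p2, "child", c1);
        new JoinLink("parent", p1, "child", c2);
        for (Item child : p1.linked("parent")) {
            System.out.println(child.name); // the children, not the JoinLink objects
        }
    }
}
```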

Remembering Objects

Being able to retrieve a persistent object when its identity is known is not sufficient on its own, since this would require some external configuration to record specific object identities.

Instead, internally, the ObjectManager creates - and itself remembers the identity of - a specific GPOMap object that it uses to store "global" references.

The ObjectManager provides methods to remember and recall values.

Using these methods specific objects can be remembered against provided keys:

om.remember("root", gpm);
gpm == om.recall("root");

When the system is shut down and subsequently restarted, the ObjectManager will recall values previously remembered.
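A minimal sketch of this idea follows, with an in-memory SketchStore standing in for persistent storage. All names here (SketchStore, RememberingManager, create) are hypothetical, and the sketch remembers object identities rather than object references; a "restart" is simulated by creating a second manager over the same store.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for persistent storage: identity -> property map (hypothetical).
class SketchStore {
    final Map<Long, Map<String, Object>> records = new HashMap<>();
    long nextId = 1;
    Long rootId; // identity of the "global references" record, once created
}

// Hypothetical manager that keeps global references in one well-known record.
class RememberingManager {
    private final SketchStore store;

    RememberingManager(SketchStore store) {
        this.store = store;
        if (store.rootId == null) {
            // First start: create the record that holds remembered keys.
            store.rootId = create();
        }
    }

    // Creates an empty record and returns its identity.
    long create() {
        long id = store.nextId++;
        store.records.put(id, new HashMap<>());
        return id;
    }

    Map<String, Object> getObject(long id) {
        return store.records.get(id);
    }

    // Stores the identity under the given key in the root record.
    void remember(String key, long id) {
        getObject(store.rootId).put(key, id);
    }

    // Returns the remembered identity, or null if the key is unknown.
    Long recall(String key) {
        return (Long) getObject(store.rootId).get(key);
    }
}

class RememberSketch {
    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        RememberingManager om = new RememberingManager(store);
        long id = om.create();
        om.getObject(id).put("name", "My First Object");
        om.remember("root", id);

        // Simulated restart: a new manager over the same store.
        RememberingManager restarted = new RememberingManager(store);
        Long recalled = restarted.recall("root");
        System.out.println(restarted.getObject(recalled).get("name"));
    }
}
```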

Object Lookup

The system as described so far allows the definition of arbitrary object properties and provides efficient and scalable navigation of general object associations.

When using such a system this is often sufficient, but frequently there is a requirement to access some specific object within some set of objects. GPO supports this with objects called Classifiers.

A Classifier effectively indexes a set of objects, as defined by a one-to-many or many-to-many association, on some specified property.

The ObjectManager enables this via a registerClassifier method:

om.registerClassifier("parent", "name");

This ensures that when the set implied by assigning an object property is created, a Classifier object is created to manage the index structures.

Note that a set may be classified on more than one property.

The Classifier object can be retrieved from the LinkSet object and provides a number of methods to access objects in the set.

The obvious method is getValue(Object propertyValue) but other methods provide support for other processing, such as range iteration, counts and index access.

The direct access to the Classifier objects is a novel feature of GPO and supports functionality that in other systems requires the maintenance of other parallel structures that would complicate the application model.

For example, suppose we would like to create some kind of hierarchical access to the contents of a large set, say one of one million objects. I hope you would agree that asking the user to choose an object from a single ordered list of that size is not a good solution.

Let's suppose that we have classified a "name" property. Using the Classifier we could create some alphabetical selection interface, so the user might select the objects beginning with the letter 'M'. They might then choose the objects in the range "Mab" to "Mef", and so on.

Within a few clicks they could have found the object they wanted - "Matchsticks" for example.

And this would have been processed very efficiently, because the Classifier can provide direct access to methods such as getRangeCount(startKey, endKey), getKeyAtIndex(index) and getIndexedObject(index).
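To make the drill-down concrete, here is a minimal sketch of an ordered index offering the three methods named above. It is not the GPO Classifier: the class SketchClassifier is hypothetical, it keeps keys in a sorted ArrayList for brevity (so inserts are linear, which a real index over millions of objects would avoid), and it assumes a range includes startKey and excludes endKey, which the text does not specify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ordered index over string keys (not the GPO Classifier).
class SketchClassifier {
    private final List<String> keys = new ArrayList<>();    // kept sorted
    private final List<Object> objects = new ArrayList<>(); // parallel to keys

    // Binary search: index of the first key that is >= probe.
    private int lowerBound(String probe) {
        int lo = 0;
        int hi = keys.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (keys.get(mid).compareTo(probe) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Inserts at the sorted position so both lists stay aligned.
    void add(String key, Object obj) {
        int at = lowerBound(key);
        keys.add(at, key);
        objects.add(at, obj);
    }

    // Exact lookup; returns null when the key is absent.
    Object getValue(String key) {
        int at = lowerBound(key);
        return (at < keys.size() && keys.get(at).equals(key)) ? objects.get(at) : null;
    }

    // Assumption: counts keys k with startKey <= k < endKey.
    int getRangeCount(String startKey, String endKey) {
        return Math.max(0, lowerBound(endKey) - lowerBound(startKey));
    }

    String getKeyAtIndex(int index) {
        return keys.get(index);
    }

    Object getIndexedObject(int index) {
        return objects.get(index);
    }
}

class ClassifierSketch {
    public static void main(String[] args) {
        SketchClassifier byName = new SketchClassifier();
        for (String name : new String[] {"Matchsticks", "Apple", "Mango", "Melon", "Zebra", "Nails"}) {
            byName.add(name, "object:" + name);
        }
        System.out.println(byName.getRangeCount("M", "N"));     // names starting with 'M'
        System.out.println(byName.getRangeCount("Mab", "Mef")); // the narrower band
        System.out.println(byName.getKeyAtIndex(2));
    }
}
```

A selection interface could call getRangeCount to decide how finely to split each level, then use getKeyAtIndex to label the bands, without ever loading the objects themselves.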

Summary

This may appear to be a pretty short paper, but that is really the point: GPO is not a complex idea.

Objects have identity, state and they persist.

Objects can have associations with other objects.

Sets defined by associations themselves have identity. These sets can be "classified" and a "Classifier" object retrieved with which to efficiently look up values in the set.

That's it!