The GPO object model builds on the storage package to provide object persistence.

To persist individual objects, a structure is needed that maps a persistent object ID to the storage reference used to access the persistent representation, and that also connects the ID to any current in-memory representation.

Running Object Table

A running object table provides the structure that an Object Manager uses to access an in-memory object.

If the object is not in memory, the Object Manager requests it from the persistent index structure and then adds it to the running object table.
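The lookup path described above can be sketched as follows. This is a minimal illustration, not the GPO API: the class name RunningObjectTable and the loader function (standing in for "resolve the ID through the persistent index and materialise the object") are hypothetical, and eviction is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical sketch of a running object table; names are illustrative only.
final class RunningObjectTable<T> {
    // Object ID -> in-memory instance currently known to the Object Manager.
    private final Map<Long, T> live = new HashMap<>();
    // Assumed stand-in for the persistent index lookup plus deserialization.
    private final LongFunction<T> loader;

    RunningObjectTable(LongFunction<T> loader) {
        this.loader = loader;
    }

    // Returns the in-memory object, loading and registering it on a miss.
    T get(long id) {
        T obj = live.get(id);
        if (obj == null) {
            obj = loader.apply(id);   // miss: go to the persistent index
            live.put(id, obj);        // register before handing it out
        }
        return obj;
    }
}
```

A second get for the same ID returns the same instance without calling the loader again, which is the property the Object Manager relies on.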

The Persistent Index

The persistent index must provide efficient retrieval and update of the storage reference associated with a given object ID.

Again, scalability is of the utmost importance. As the number of objects grows, the index performance should not significantly degrade.

A tree structure is used, with each 64-byte node containing 16 entries. The ID directly provides the node path to follow, with an addressing convention signifying when a further node is to be traversed or the target reached.

An index structure of depth 6 would provide access to up to 16^6 (about 16.7 million) objects, and updating a single entry requires six modified nodes to be written, one per level.
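How an ID can act as a path through 16-entry nodes can be sketched as below. Each level consumes 4 bits of the ID, since 16 entries need 4 bits to address. The class IndexPath, the choice to read the most significant group first, and the fixed depth are assumptions made for illustration; the addressing convention that marks "further node" versus "target reached" is not modelled here.

```java
// Hypothetical sketch: derives the slot to follow at each tree level from an ID.
final class IndexPath {
    static final int ENTRIES_PER_NODE = 16; // from the text: 16 entries per node
    static final int BITS_PER_LEVEL = 4;    // log2(16)

    // Slot index at each level, root first (assumed ordering).
    static int[] slots(long id, int depth) {
        int[] path = new int[depth];
        for (int level = 0; level < depth; level++) {
            int shift = BITS_PER_LEVEL * (depth - 1 - level);
            path[level] = (int) ((id >>> shift) & (ENTRIES_PER_NODE - 1));
        }
        return path;
    }

    // Number of distinct IDs addressable by a tree of the given depth.
    static long capacity(int depth) {
        return 1L << (BITS_PER_LEVEL * depth);
    }
}
```

For example, IndexPath.slots(0xABCDEFL, 6) yields the slots 10, 11, 12, 13, 14, 15, one per level, and an update to that entry dirties exactly those six nodes.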

A combination of weak reference linking and a most recently accessed ring cache is used to ensure that disk access is minimised.
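One way to combine the two mechanisms is sketched below, under stated assumptions: every loaded node is reachable through a weak reference, so the garbage collector may reclaim it, while a fixed-size ring holds strong references to the most recently accessed nodes so that they stay in memory. The class NodeCache, its diskReader function and the map-based weak linking are hypothetical and are not taken from the GPO source.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical sketch: weak references plus a ring of recently accessed nodes.
final class NodeCache<N> {
    // Node address -> weakly held node; the collector may clear these.
    private final Map<Long, WeakReference<N>> weak = new HashMap<>();
    // Strong references to the most recently accessed nodes.
    private final Object[] ring;
    private int next = 0;
    // Assumed stand-in for reading a node from the backing store.
    private final LongFunction<N> diskReader;
    int diskReads = 0;

    NodeCache(int ringSize, LongFunction<N> diskReader) {
        this.ring = new Object[ringSize];
        this.diskReader = diskReader;
    }

    N get(long address) {
        WeakReference<N> ref = weak.get(address);
        N node = (ref == null) ? null : ref.get();
        if (node == null) {
            // Never loaded, or reclaimed since: read it from disk again.
            node = diskReader.apply(address);
            diskReads++;
            weak.put(address, new WeakReference<>(node));
        }
        ring[next] = node;                 // overwrite the oldest ring slot
        next = (next + 1) % ring.length;
        return node;
    }
}
```

A node that is still in the ring is always found without a disk read. A node that has dropped out of the ring is found only if the collector has not yet reclaimed it. This sketch does not prune cleared map entries.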

In long-lived transactions, modified index nodes are saved incrementally to ensure Java memory resources are not strained. This has been tested in single transactions where 10 million objects have been updated with a Java VM of 12 MB.
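Incremental saving can be sketched as a bounded buffer of modified nodes that is written out whenever it fills, so memory use stays flat however long the transaction runs. The class DirtyNodeBuffer, the threshold parameter and the writer callback are assumptions for illustration; the actual policy used by the index is not described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch: flush modified nodes once a threshold is reached.
final class DirtyNodeBuffer {
    // Node address -> modified node image awaiting a write.
    private final Map<Long, byte[]> dirty = new LinkedHashMap<>();
    private final int maxDirty;
    // Assumed stand-in for writing one node to the store.
    private final BiConsumer<Long, byte[]> writer;
    int flushes = 0;

    DirtyNodeBuffer(int maxDirty, BiConsumer<Long, byte[]> writer) {
        this.maxDirty = maxDirty;
        this.writer = writer;
    }

    // Records a modified node; repeated updates to one node are coalesced.
    void markDirty(long address, byte[] node) {
        dirty.put(address, node);
        if (dirty.size() >= maxDirty) {
            flush();
        }
    }

    // Writes all pending nodes and releases them; also called at commit.
    void flush() {
        if (dirty.isEmpty()) {
            return;
        }
        dirty.forEach(writer);
        dirty.clear();
        flushes++;
    }

    int pending() {
        return dirty.size();
    }
}
```

With a threshold of 3, marking seven distinct nodes triggers two automatic flushes and leaves one node pending until the final flush at commit.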

Object Retrieval

A key performance indicator is the rate at which objects can be retrieved from a backing store.

This depends on two main factors: disk access, which varies with the degree of file caching, and the deserialization cost.

A recent benchmark using PDOM involved enumerating a large number of objects from the store. Here are the results:

full recurse took 11897 millisecs for 119435 nodes
Elements : 107797 text : 11638 comments : 0

The text nodes were local objects held within element nodes, so for our purposes it took 11897 milliseconds to retrieve 107797 addressable objects, or just over 9000 objects per second (107797 / 11.897 s is approximately 9061).