Why is it hard to store and access data? Why are spreadsheets useful? How come much of what I do seems semi-mechanical?
These are some of the questions that have led to CTC development.
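This document introduces four key CTC technologies: the Generic Persistent Object (GPO) model, GPO formulas, the "Write-Once" persistent store and the Alchemist system generator.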
Its brief content provides context and justification for each technology and attempts to convey an impression of its functionality and potential. It is aimed at a wide audience, from project managers to analysts, designers and active developers.
After more than twenty years of commercial software development, it became apparent that too much time is spent solving the wrong problems - using inappropriate technology.
Rather than simply complaining about how complicated systems were becoming, it was time to do something about it: to "cut the crap" and develop technologies that solve the real problems system designers face - and not to create new ones.
Built on Java
All the technology has been developed using the Java language. Java was probably the first commercially acceptable dynamic object-oriented language, and it has proved extremely stable over the years I have used it.
I continue to be impressed with Java performance.
Why not C#?
C# is Microsoft's version of Java and in some areas it appears to be an improvement. But there cannot be the same level of confidence, based on past experience, that the technology will remain stable. And of course, any implementation will be restricted to a Microsoft platform.
A very large proportion (> 90%) of most systems is concerned with data transformation: not just storing data in, and retrieving it from, some storage system like a database, but also preparing data for use in a GUI or web browser, or converting data to and from web services. None of this helps to solve the application problem.
Because of the amount of data transformation needed, there is always a trade-off between the suitability of the model for solving the users' problems and the complexity of the tasks needed to manipulate it.
What's wrong with relational databases?
Nothing, but they are not the answer to everything.
Probably the biggest data system around today is managed by Google. It should come as no surprise that Google does not use a relational database to hold all its data. What was the primary focus in maximising Google's performance? To minimise disk access - the one area that has hardly improved in the last thirty years. That's why Google takes less time to search and find a phrase in a single document (from the thousands of millions indexed) than your local supermarket system will need to compute the price of a two-for-one offer stored in its industry-compliant relational database.
What other applications do you frequently use? Language Compilers? Word processors? Spreadsheets? Presentation software? None of these applications use conventional relational databases.
So there is good reason to consider alternatives to conventional database technology when developing new systems.
Designers should be free to concentrate on what models they wish to build to represent the "real world". Too often, system designers are overly concerned with how the model will have to be stored.
The point of a "persistence" system is not only to provide a mechanism for the storage and retrieval of the data (to "persist" it), but also to simplify these tasks so that the designer/programmer can concentrate on using the model in their application.
Such a system also allows the designer to concentrate on developing the best model.
The Generic Persistent Object model provides such a solution.
GPO is not only a mechanism to "persist" (store) data, but also a powerful model that directly supports the creation and management of object associations (the reference of one object to another object or to a set of objects).
In GPO, object navigation provides access to object properties and associated objects. The designer/programmer is not concerned with how data is stored in, and retrieved from, the underlying persistent store. They simply access and update objects directly.
The main reason that direct navigation of this kind is a problem with conventional databases is that they have no "in-built" awareness of data identity. It may be possible to define FOREIGN-KEY constraints and to arrange for such keys to be indexed to provide lookup support, but these are annotations of the basic structures.
In the Generic Persistent Object model, the structures that provide the object association information are the same as used to "index" the set and to define "dependencies" that support referential integrity.
It is simply not possible in GPO to reference an object that no longer
exists.
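The idea that a single structure can act both as the association index and as the dependency record can be illustrated with a small in-memory sketch. This is not the GPO API: the class `AssociationIndex` and its methods are hypothetical names chosen for the illustration, and refusing to remove a referenced object is only one possible policy for enforcing referential integrity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the GPO API: one map is both the lookup index
// ("which objects reference X?") and the dependency record guarding removal.
class AssociationIndex {
    private final Set<String> objects = new HashSet<>();
    // target id -> ids of the objects that reference it
    private final Map<String, Set<String>> referrers = new HashMap<>();

    void add(String id) {
        objects.add(id);
    }

    // Record that 'source' references 'target'; both must already exist.
    void link(String source, String target) {
        if (!objects.contains(source) || !objects.contains(target)) {
            throw new IllegalArgumentException("unknown object");
        }
        referrers.computeIfAbsent(target, k -> new HashSet<>()).add(source);
    }

    // Index use: every object that currently references 'target'.
    Set<String> referrersOf(String target) {
        return Collections.unmodifiableSet(
                referrers.getOrDefault(target, Collections.emptySet()));
    }

    // Dependency use: an object that is still referenced is not removed.
    boolean remove(String id) {
        if (referrers.containsKey(id)) {
            return false;
        }
        // Drop the references this object held, then any entries left empty.
        referrers.values().forEach(sources -> sources.remove(id));
        referrers.values().removeIf(Set::isEmpty);
        return objects.remove(id);
    }
}
```

In this sketch a customer that is still referenced by an order cannot be removed; once the order is removed, the customer can be. No separate constraint or index has to be declared, because the same map answers both questions.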
No other object persistence mechanism supports such scalable and maintainable object association mechanisms.
In addition to common programming approaches such as procedural or object-oriented styles, spreadsheet users are aware that they are also able to build sophisticated data models.
A recent addition to the GPO system has been the ability to define
formulas for a specific object property.
GPO formulas are more functional than the computed values of
many database products and have more in common with a spreadsheet approach. For example:
gpo.define("label", "(-> type name)");
defines a "label" property computed as the "name" of the associated "type" object. If the object is associated with a different "type" object, or the "name" property of its current "type" object changes, then the formula is re-calculated.
This approach minimises re-computation and allows computed values to be indexed and used in lookups.
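The re-calculation behaviour described above can be sketched in plain Java. This is an illustration of the spreadsheet-style idea only, not the GPO formula engine: the classes `TypeObj` and `Item`, and the `recalculations` counter, are hypothetical and exist purely to show that the cached value is refreshed only when one of its inputs changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the GPO implementation.
class TypeObj {
    private String name;
    private final List<Runnable> dependents = new ArrayList<>();

    TypeObj(String name) { this.name = name; }

    String getName() { return name; }

    void setName(String name) {
        this.name = name;
        // Notify a copy so a dependent may unsubscribe while being notified.
        for (Runnable r : new ArrayList<>(dependents)) {
            r.run();
        }
    }

    void subscribe(Runnable r) { dependents.add(r); }

    void unsubscribe(Runnable r) { dependents.remove(r); }
}

class Item {
    private TypeObj type;
    private String label;            // cached result of the "label" formula
    int recalculations = 0;          // counts how often the formula ran
    private final Runnable recalc = this::recalculate;

    Item(TypeObj type) { setType(type); }

    // Changing the associated type re-targets the dependency and recalculates.
    void setType(TypeObj newType) {
        if (type != null) {
            type.unsubscribe(recalc);
        }
        type = newType;
        type.subscribe(recalc);
        recalculate();
    }

    // Equivalent in spirit to the formula (-> type name).
    private void recalculate() {
        label = type.getName();
        recalculations++;
    }

    // Reading the label never recomputes it; the cached value is returned.
    String getLabel() { return label; }
}
```

Because the value is cached and refreshed only on change, reads are cheap, and the stored value could in principle be indexed like any other property.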
A number of useful formula primitives are provided and a simple mechanism exists for new formulas to be defined.
No other persistence system or database provides similar functionality.
There are two standard persistent stores: an updatable store that frees and reallocates storage, and a "Write-Once" store that never reallocates previously freed storage.
Both stores will "preserve session data" and not overwrite storage freed in the current session. This ensures that maximum transaction isolation can be provided for long-lived transactions.
The "Write-Once" or WORM store goes a step further and never
overwrites freed storage. When some storage is updated, the new version is written to
currently unallocated store and a reference is written that allows the new version to
access the previous version.
Furthermore, when a set of system updates is "committed", data is saved that allows the system to retrieve the "previously committed" state. The date/time of each "commit" is also saved, allowing straightforward access of system state from any given time in the past.
When linked with the GPO model this enables the easy retrieval and navigation
of historical systems.
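A minimal in-memory sketch may help to show how a write-once discipline makes historical access straightforward. It is not the CTC store: the class `WriteOnceStore`, its methods and the use of caller-supplied timestamps are assumptions made for the illustration, and copying the whole key table at each commit is a simplification that a real store would avoid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the CTC store: slots are only ever appended.
class WriteOnceStore {
    // One immutable record per write; 'previous' is the slot of the
    // version it supersedes, or -1 if there was none.
    private static final class Version {
        final String value;
        final int previous;

        Version(String value, int previous) {
            this.value = value;
            this.previous = previous;
        }
    }

    private final List<Version> slots = new ArrayList<>();
    private final Map<String, Integer> current = new HashMap<>();
    // commit time -> snapshot of key -> slot at that commit
    private final TreeMap<Long, Map<String, Integer>> commits = new TreeMap<>();

    // An update writes a new slot that refers back to the old one.
    void put(String key, String value) {
        Integer prev = current.get(key);
        slots.add(new Version(value, prev == null ? -1 : prev));
        current.put(key, slots.size() - 1);
    }

    String get(String key) {
        Integer slot = current.get(key);
        return slot == null ? null : slots.get(slot).value;
    }

    // Save enough to recover this committed state later.
    void commit(long timestamp) {
        commits.put(timestamp, new HashMap<>(current));
    }

    // State as of the latest commit at or before 'timestamp'.
    String getAsOf(String key, long timestamp) {
        Map.Entry<Long, Map<String, Integer>> entry = commits.floorEntry(timestamp);
        if (entry == null) {
            return null;
        }
        Integer slot = entry.getValue().get(key);
        return slot == null ? null : slots.get(slot).value;
    }

    // Follow the back-references from the newest version to the oldest.
    List<String> history(String key) {
        List<String> result = new ArrayList<>();
        Integer slot = current.get(key);
        int i = slot == null ? -1 : slot;
        while (i >= 0) {
            Version v = slots.get(i);
            result.add(v.value);
            i = v.previous;
        }
        return result;
    }
}
```

Since nothing is overwritten, every committed state remains readable; looking up a past value is simply a matter of finding the commit in force at the requested time.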
No other storage system provides such simple, efficient, direct access to historical data.
These days all programmers will probably have used "code-generation" in one form or another, from "wizards" that provide "wrapper" classes for the latest .Net protocols to "entity" generators that generate classes to retrieve data from some SQL database.
But most of the things that code generators are used for are not needed for GPO.
The whole point of GPO is to provide a natural, simple and highly functional
model that designers and programmers can use directly.
However, experience of using GPO in a number of commercial developments has
led to the identification of several patterns of use that appeared at least semi-mechanical
in their application. These patterns cover such things as system initialization, object
lifetime management, property validation and object lookup.
This has resulted in the development of the "Alchemist" system generator. The Alchemist utilizes a simple model definition to generate a complete object model. This includes java source code, documentation and a deployable web application.
The Alchemist model definition is the "MetaData" of the resulting model - just as a database schema is the metadata of a database.
When the Alchemist generates the java code that implements the model, it also generates support that allows the objects to provide access to the metadata that defined them. Thus the persistent objects within the generated model are self-describing.
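What "self-describing" might look like in generated code can be sketched as follows. This is not actual Alchemist output: `PropertyDef`, `ModelDef`, `DescribedObject`, `Customer` and `GenericView` are hypothetical names, and the example only assumes that each generated class can hand back the metadata that defined it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Alchemist.
final class PropertyDef {
    final String name;
    final Class<?> type;
    final boolean required;

    PropertyDef(String name, Class<?> type, boolean required) {
        this.name = name;
        this.type = type;
        this.required = required;
    }
}

// The model definition: the "metadata" of a generated class.
final class ModelDef {
    final String name;
    final Map<String, PropertyDef> properties = new LinkedHashMap<>();

    ModelDef(String name) { this.name = name; }

    ModelDef property(String propName, Class<?> type, boolean required) {
        properties.put(propName, new PropertyDef(propName, type, required));
        return this;
    }
}

abstract class DescribedObject {
    private final Map<String, Object> values = new LinkedHashMap<>();

    // Every object can return the definition it was generated from.
    abstract ModelDef describe();

    // Validation is driven by the metadata, not by hand-written checks.
    void set(String property, Object value) {
        PropertyDef def = describe().properties.get(property);
        if (def == null) {
            throw new IllegalArgumentException("no such property: " + property);
        }
        boolean invalid = (value == null) ? def.required : !def.type.isInstance(value);
        if (invalid) {
            throw new IllegalArgumentException("invalid value for " + property);
        }
        values.put(property, value);
    }

    Object get(String property) { return values.get(property); }
}

// Stands in for a generated class.
class Customer extends DescribedObject {
    static final ModelDef META = new ModelDef("Customer")
            .property("name", String.class, true)
            .property("age", Integer.class, false);

    @Override
    ModelDef describe() { return META; }
}

// A generic view needs no knowledge of Customer, only of the metadata.
class GenericView {
    static String render(DescribedObject obj) {
        StringBuilder sb = new StringBuilder(obj.describe().name);
        for (PropertyDef p : obj.describe().properties.values()) {
            sb.append(' ').append(p.name).append('=').append(obj.get(p.name));
        }
        return sb.toString();
    }
}
```

The point of the sketch is that `GenericView` works for any described object, which is the property a generic but customizable application can build on.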
The Web Application that is built by the Alchemist utilizes the "self-describing" properties to provide a generic, but customizable, application.
To demonstrate this, the Alchemist has generated its own web application that is used to incrementally and interactively define and test new models, before final generation and deployment.
No other generation system generates a self-describing system.
For more information on CTC software, visit the main website: www.cutthecrap.biz where you will find a number of articles. To speed that process up, here is a selection of links:
| The Generic Persistent Object Model | Provides an overview of the Generic Persistent Object Model. |
| Referential Integrity: A Problem Solved | Discusses the problem of referential integrity and how the Generic Persistent Object Model solves the problem. |
| Incidental Persistance | The generic model is something more than simply persistent. |
| Unification of Spreadsheets with a Persistent Object Model | Discusses the approach taken to unify dataflow/spreadsheet type programming with the Generic Persistent Object Model. |
| Model-based RAD with the Alchemist | Explains the Alchemist approach to system and application generation. |
| r-value Data in an l-value World | Presents a different take on system design. |
| The Odyssey | Provides a concise tutorial introduction to the CTC software. |
| The Alchemist Web Application | Guides the user from installation to generating a new object model. |
| Building A GPO Based Data Model | Using a simple example, demonstrates how specializing GPO classes can be used to implement a data model. |
The full distribution can be freely downloaded from the main download page.