What Is A Striterator?

Author: Martyn Cutcher

email martyn@cutthecrap.biz

website www.cutthecrap.biz

Well .. What is it?

A Striterator implements the standard Iterator and Enumeration interfaces while providing functionality normally associated with stream processing.

How An Iterator Is Like A Stream

Unlike a Set or List object, an Iterator delivers one object at a time. The next method is directly analagous to the readObject method of an ObjectInputStream. Consider this example code:

ArrayList list;

FileOutputStream outfile = new FileOutputStream("tempFile.buf");
ObjectOutputStream objout = new ObjectOutputStream(outfile);

Iterator objs = list.iterator();
while (objs.hasNext()) {
  objout.writeObject(objs.next());
}

objout.close();
outfile.close();

FileInputStream infile = new FileInputStream("tempFile.buf");
ObjectInputStream objin = new ObjectInputStream(outfile);

list = new ArrayList();

while (objin.available() > 0) {
  list.add(objin.readObject());
}

objin.close();
infile.close();

The intention of the above example is to show that computationally the streams and Iterator are used similarly. This reduced snippet attempts to clarify this:

...
while (objs.hasNext()) {
  objout.writeObject(objs.next());
}
...
while (objin.available() > 0) {
  list.add(objin.readObject());
}
...

Stream Processing

Since computing has been concerned with input/output for quite some time there are a number of accepted patterns associated with stream processing. In the example above I have combined two streams FileOuputStream and ObjectOutputStream in order to serialize objects to a file.

In other situations streams are used in order to produce readable representations.

In unix it is a well understood technique to pipe output from one command into another. This is just stream processing.

Transformation

Essentially streams provide a modular mechanism to apply transformations to sequences of data.

A Striterator Is A Transforming Iterator

Firstly an Iterator is equivalent to an InputStream - there is no equivalent to an OutputStream (for now anyhow).

Consider a custom InputStream - MyInputStream, it will normally provide a constructor that takes another InputStream:

public class MyInputStream implements InputStream {

  public MyInputStream(InputStream src) {
    ...
  }
  
  ...
}

In this way MyInputStream can filter and transform any InputStream provided and can itself be used as a source for other stream transformations.

In the same way a Striterator is provided with a source Iterator:

public class Striterator implements Iterator, Enumeration {

  public Striterator(Iterator src) {
    ...
  }
  
  ...
}

If you create a Striterator like this:

Striterator striter = new Striterator(list.iterator());

..an iteration of striter will produce precisely the same result as using the list.iterator() directly, the behaviour is simply delegated to the "source" iterator.

Objects Not Data

Although you can retrieve objects from some streams, it is understood that these are built from lower-level data.

An Iterator on the other hand is a protocol that only delivers objects.

So the kind of transformations that are required are probably quite different from what we might consider "normal" stream processing.

Cut The Crap - Give Me An Example!

Okay, suppose you have a list of objects - say customer records - and you want to print out the names of all the customers. The code for this might look like this:

void printCustomerNames(Iterator customers) {
  while (customers.hasNext()) {
    System.out.println(((Customer) customers.next()).getName());
  }
}

That all seems nice and easy, but maybe we have some other requirement to print Supplier names:

void printSupplierNames(Iterator suppliers) {
  while (suppliers.hasNext()) {
    System.out.println(((Supplier) suppliers.next()).getName());
  }
}

You have to bear with this a while, since this is a necessarily simple example I don't want anyone jumping up and down and pointing out that both Supplier and Customer could both implement an INamedObject interface.

How about the following method:

void printOut(Iterator strings) {
  while (strings.hasNext()) {
    System.out.println(strings.next());
  }
}

In order to feed this method, we need to supply an iterator of string values, here is one way to do it:

void printSupplierNames(Iterator suppliers) {
  ArrayList names = new ArrayList();
  while (suppliers.hasNext()) {
    names.add(((Supplier) suppliers.next()).getName());
  }
  
  printOut(names.iterator());
}

Okay, this is a little over-the-top for our example, but if you could imagine that printOut was really a fairly complex method it may be worth considering.

Let's rationalize this a little:

Iterator getSupplierNames(Iterator suppliers) {
  ArrayList names = new ArrayList();
  while (suppliers.hasNext()) {
    names.add(((Supplier) suppliers.next()).getName());
  }
  
  return names.iterator();
}

void printSupplierNames() {
  printOut(getSupplierNames());
}

Okay, that should be starting to look a little more promising. We now have a method - getSupplierNames - that we could call in a number of places to pass on to other methods. We have definitely added some "modularity" to our system.

...but

We have had to create an intermediate ArrayList object to hold the values before creating the resulting Iterator.

We want to find a way to deliver an Iterator that performs this transformation incrementally, in the same way that Streams interact. Here is the simple - but excellent! - idea.

Iterator getSupplierNames(final Iterator suppliers) {
  return new Iterator() {
    public boolean hasNext() {
      return suppliers.hasNext();
    }
    public Object next() {
      return ((Supplier) suppliers.next()).getName();
    }
    public void remove() {
      suppliers.remove();
    }
  };
}

An important point here is that the suppliers parameter is declared to be final, this allows it to be accessed within the anonymous Iterator we defined.

Take a while to look at that code. The returned Iterator transforms the original iteration of Supplier objects and "resolves" each to the value of the getName method.

It achieves this incrementally, and there is no significant performance penalty in using this iteration over the custom code we began with - no matter how large the iteration!.

Is That It?

No, but I hope you agree that even that is pretty neat.

Let's evolve this idea a little further:

The Resolver

We have identified one of the key transforming patterns - resolving. Let's encapsulate this pattern in a class:

public abstract class Resolver implements Iterator {
  Iterator m_src;
  public Resolver(Iterator src) {
    m_src = src;
  }
  public boolean hasNext() {
    return m_src.hasNext();
  }
  public Object next() {
    return resolve(m_src.next());
  }
  public void remove() {
    m_src.remove();
  }
  protected abstract Object resolve(Object src);
}

Iterator getSupplierNames(Iterator suppliers) {
  return new Resolver(suppliers) {
    protected Object resolve(Object src) {
      return ((Supplier) src).getName();
    }
  };
}

We are now starting to get close to the Striterator, which is just a few more evolution steps along the way.

Other Patterns

To discover striteration patterns, simply examine your own iterations and consider what is going on. You should find the following examples:

The Filter

Validity checking - should the "current" object be processed?

A very common check is on the class of the object.

  if (obj instanceof MyClass) {
    process(obj);
  }

The Expander

Where the value of the iteration is expanded into a further iteration.

  Iterator children = ((Parent) obj).getChildren();
  while (children.hasNext()) {
    ...

The Appender

Appending is not something that is initially evident, but once you begin to use the other patterns - to check, resolve and expand - it becomes clear that on occasion there is a need to concatenate two iterations.

The Visitor

In languages like Lisp mapping functions have always been recognized for their power.

More recently, the so called Visitor pattern has been popularised to do something similar.

The Visitor is a side-effecting pattern that carries out some processing when passed each member of the iteration.

Pattern Implementation

Before developing the different pattern implementation classes we shall code some utility classes.

EnumIterator

This utility class simply wraps an Enumeration object to provide an Iterator

public class EnumIterator implements Iterator {
  Enumeration m_src;
  
  public EnumIterator(Enumeration src) {
    m_src = src;
  }
  
  public hasNext() {
    return m_src.hasMoreElements();
  }
  
  public Object next() {
    return m_src.nextElement();
  }
  
  public void remove() {
    throw new UnsupportedOperationException();
  }
}

WrappedIterator

Wraps an Iterator object implementing both Iterator and Enumeration interfaces.

public class WrappedIterator implements Iterator, Enumeration {
  Iterator m_src;
  
  public WrappedIterator(Iterator src) {
    m_src = src;
  }
  public WrappedIterator(Enumeration src) {
    m_src = new EnumIterator(src);
  }
  
  // Iterator
  public hasNext() {
    return m_src.hasNext();
  }
  
  public Object next() {
    return m_src.next();
  }
  
  public void remove() {
    throw m_src.remove();
  }
  
  // Enumeration
  public hasMoreElements() {
    return hasNext();
  }
  
  public Object nextElement() {
    return next();
  }
}

EmptyIterator

Rather than returning null, an Iterator returning function should always return a valid Iterator

public class EmptyIterator implements Iterator {

public final static Iterator s_standard = new EmptyIterator();

  public EmptyItertator() {}
  
  public hasNext() {
    return false;
  }
  
  public Object next() {
    throw NoSuchElementException();
  }
  
  public void remove() {
    throw NoSuchElementException();
  }
}

SingleValueIterator

Occasionally we may have only a single value to return, a SingleValueIterator supports this pattern in a straightforward way.

public class SingleValueIterator implements Iterator {
  Object m_src;
  boolean m_dataAvail = true;
  
  public SingleValueIterator(Object src) {
    m_src = src;
  }
  
  public boolean hasNext() {
    return m_dataAvail;
  }
  
  public Object next() {
    if (m_dataAvail) {
      m_dataAvail = false;
      
      return m_src;
    } else {
      throw new NoSuchElementException();
    }
  }

  public void remove() {
    throw UnsupportedOperationException();
  }
}

Pattern Implementations

Here are anstract classes for the patterns discussed above.

Resolver

We'll recode the Resolver class to use the WrappedIterator:

public abstract class Resolver extends WrappedIterator {

	public Resolver(Iterator src) {
	  super(src);
	}
	
  final public Object next() {
    return resolve(super.next());
  }
  
  protected abstract Object resolve(Object src);
}

Filter

The Filter class needs to read ahead to determine if there exists a valid next object.

public abstract class Filter extends WrappedIterator {
  Object m_current = null;
  boolean m_hasNext = false;
  
  public Filter(Iterator src) {
    super(src);
    
    checkNext();
  }
  
  private void checkNext() {
    m_hasNext = false;
    while (super.hasNext()) {
      m_current = super.next();
      m_hasNext = isValid(m_current);
      if (m_hasNext) {
        break;
      }
    }
  }
  
  protected boolean isValid(Object src);
  
  public boolean hasNext() {
    return m_hasNext;
  }
  
  public Object next() {
    if (!m_hasNext) {
      throw NoSuchElementException();
    } else {
      Object ret = m_current;
      
      checkNext();
      
      return ret;
    }
  }
}

Expander

The Expander maintains a "child" iterator for the current element. Like the Filter class it is required to do some eager computation to support the hasNext protocol.

public abstract class Expander extends WrappedIterator {
  Iterator m_child = null;
  
  public Expander(Iterator src) {
    super(src);
    
    checkNext();
  }
  
  private void checkNext() {
    while (!m_child.hasNext()) {
      if (super.hasNext()) {
        m_child = expand(super.next());
      }
    }
  }        
  
  protected abstract Iterator expand();
  
  public boolean hasNext() {
    return m_child.hasNext();
  }
  
  public Object next() {
    Object ret = m_child.next();
    
    checkNext();
    
    return ret;
  }
  
  public void remove() {
    m_child.remove();
  }
}

Appender

The Appender just needs to overide next to check when to append the second iterator.

public abstract class Appender extends WrappedIterator {
  Iterator m_other = other;
  
  public Appender(Iterator src, Iterator other) {
    super(src);
    
    m_other = other;
    
    if (!hasNext()) {
      replace();
    }
  }
  
  private void replace() {
    m_src = m_other;
    m_other = null;
  }
  
  public Object next() {
    Object ret = super.next();
    
    if (!hasNext()) {
      replace();
    }
    
    return ret;
  }
}

The Striterator

The Striterator adds an extra twist to the code we have developed so far.

It supports the "injection" of transformers, as an alternative to always needing to create a new Iterator from an existing Iterator, for example:

Iterator process(Iterator src) {
  Iterator iter1 = new Resolver(src) {
    protected Object resolve(Object src) {
      ...
    }
  };
  
  Iterator iter2 = new Expander(iter1) {
    protected Iterator expand(Object src) {
      ...
    }
  };
  
  return iter2;
}

With a Striterator the following is written:

Striterator process(Striterator src) {
  src.inject(new Resolver() {
    protected Object resolve(Object src) {
      ...
    }
  } );
  
  src.inject(new Expander() {
    protected Iterator expand(Object src) {
      ...
    }
  } );
  
  return src;
}

For this to work properly, we need to add a new method to WrappedIterator - setSource:

  public void setSource(Iterator src) {
    m_src = src;
  }

- and we will need alternative constructors for the pattern implementation classes. Then the inject method will look like this:

  public Striterator inject(WrappedIterator iter) {
    iter.setSource(m_src);
    m_src = iter;
    
    return this;
  }

Here is the code for a utility append method:

  public Striterator append(Iterator iter) {
    return inject(new Appender(iter));
  }

..and that for a addTypeFilter method:

  public Striterator addTypeFilter(final Class cls) {
    return inject(new Filter() {
      protected boolean isValid(Object obj) {
        return cls.isInstance(obj);
      }
    } );
    
    return this;
  }

Method Chaining

The reason that the utility and inject methods return the Striterator is to support method chaining, which can allow for more concise coding. For example:

public Striterator getAllData() {
  return getOneSrc().append(getOtherSrc()).append(getLastSrc());
}

...sometimes its just nice not to have to think up variable names.

Typed Striterators

You may have noticed that the main Striteration patterns have simply derived from WrappedIterator and not Striterator, the Striterator class seems only to be used for injecting and utility methods. This is true. But by always returning Striterators rather than standard Iterators this is still an advantage.

One pattern that does derive from Striterator is the "Typed Striterators".

Consider the following:

public class StringIterator implements Striterator {
  public StringIterator(Iterator src) {
    super(src);
  }
  
  public String nextString() {
    return (String) next();
  }
}

Now, it should be pretty clear that this kind of thing is going to be a lot easier once we get generic functions in java, but, in fact it is not hard anyway, and by using typed Striterators we can add compile-time type checking as well as avoiding casting in out own code:

public class CustomerIterator extends Striterator {
 
  public CustomerIterator(Iterator src) {
    super(src);
  }
  
  public Customer nextCustomer() {
    return (Customer) next();
  }
}

We can now write code like this:

CustomerIterator getCustomers() {
  return new CustomerIterator(m_customers.iterator());
}

and users of the method get a convenient access method to the data within:

CustomerIterator custs = getCustomersp();

while (custs.hasNext() {
  System.out.println(custs.nextCustomer().getName());
}

Integration With Object Models

There has always been one major advantage to returning Iterator objects over Lists and Sets and that is scalability.

An iterator may be active over millions of objects and may be considered analagous to a database cursor object.

By using Striterators rather than simple Iterators you are creating structures that can be transformed incrementally using the techniques described here. This is a very powerful and expressive computational pattern.

Can I Use This Stuff Now??

Yes, the Striterators are part of the utilities package released under GNU LGPL and can be downloaded from www.cutthecrap.biz/software/downloads.html along with other Cut The Crap software.