body { margin:0; padding:0; font-family:times; font-size:9pt; } p { margin-top:5; margin-bottom:5; font-weight:plain; font-family:serif; font-size:9pt; } h1 { margin-top:5; font-size:16pt; font-weight:bold; text-align:center; font-family:sans-serif; } h2 { margin-top:20; font-size:10pt; font-weight:bold; text-align:left; font-family:sans-serif; } h3 { margin-top:20; font-size:9pt; font-weight:bold; text-align:left; font-family:sans-serif; } pre { color:#993333; font-size:9pt; } code { color:#993333; } .section { text-align:justify; padding:5; } .note { font-size:8pt; font-style:italic; color:#000000; padding-left:5; padding-right:5; padding-top:5; padding-bottom:5; background-color:#CCCCCC }
A Striterator implements the standard Iterator and
Enumeration interfaces while providing functionality normally
associated with stream processing.
Unlike a Set or List object, an Iterator delivers
one object at a time. The next method is directly analagous to the
readObject method of an ObjectInputStream. Consider this example code:
ArrayList list;
FileOutputStream outfile = new FileOutputStream("tempFile.buf");
ObjectOutputStream objout = new ObjectOutputStream(outfile);
Iterator objs = list.iterator();
while (objs.hasNext()) {
objout.writeObject(objs.next());
}
objout.close();
outfile.close();
FileInputStream infile = new FileInputStream("tempFile.buf");
ObjectInputStream objin = new ObjectInputStream(outfile);
list = new ArrayList();
while (objin.available() > 0) {
list.add(objin.readObject());
}
objin.close();
infile.close();
The intention of the above example is to show that computationally the streams and
Iterator are used similarly. This reduced snippet attempts to clarify this:
...
while (objs.hasNext()) {
objout.writeObject(objs.next());
}
...
while (objin.available() > 0) {
list.add(objin.readObject());
}
...
Since computing has been concerned with input/output for quite some time there are a number
of accepted patterns associated with stream processing. In the example above I have combined
two streams FileOuputStream and ObjectOutputStream in order to
serialize objects to a file.
In other situations streams are used in order to produce readable representations.
In unix it is a well understood technique to pipe output from
one command into another. This is just stream processing.
Essentially streams provide a modular mechanism to apply transformations to sequences of data.
Firstly an Iterator is equivalent to an InputStream - there is
no equivalent to an OutputStream (for now anyhow).
Consider a custom InputStream - MyInputStream, it will normally
provide a constructor that takes another InputStream:
public class MyInputStream implements InputStream {
public MyInputStream(InputStream src) {
...
}
...
}
In this way MyInputStream can filter and transform any InputStream
provided and can itself be used as a source for other stream transformations.
In the same way a Striterator is provided with a source Iterator:
public class Striterator implements Iterator, Enumeration {
public Striterator(Iterator src) {
...
}
...
}
If you create a Striterator like this:
Striterator striter = new Striterator(list.iterator());
..an iteration of striter will produce precisely the same result
as using the list.iterator() directly, the behaviour is simply
delegated to the "source" iterator.
Although you can retrieve objects from some streams, it is understood that these are built from lower-level data.
An Iterator on the other hand is a protocol that only delivers objects.
So the kind of transformations that are required are probably quite different from what we might consider "normal" stream processing.
Okay, suppose you have a list of objects - say customer records - and you want to print out the names of all the customers. The code for this might look like this:
void printCustomerNames(Iterator customers) {
while (customers.hasNext()) {
System.out.println(((Customer) customers.next()).getName());
}
}
That all seems nice and easy, but maybe we have some other requirement to print Supplier names:
void printSupplierNames(Iterator suppliers) {
while (suppliers.hasNext()) {
System.out.println(((Supplier) suppliers.next()).getName());
}
}
You have to bear with this a while, since this is a necessarily
simple example I don't want anyone jumping up and down and pointing out that
both Supplier and Customer could both implement an
INamedObject interface.
How about the following method:
void printOut(Iterator strings) {
while (strings.hasNext()) {
System.out.println(strings.next());
}
}
In order to feed this method, we need to supply an iterator of string values, here is one way to do it:
void printSupplierNames(Iterator suppliers) {
ArrayList names = new ArrayList();
while (suppliers.hasNext()) {
names.add(((Supplier) suppliers.next()).getName());
}
printOut(names.iterator());
}
Okay, this is a little over-the-top for our example, but if you could imagine that
printOut was really a fairly complex method it may be worth considering.
Let's rationalize this a little:
Iterator getSupplierNames(Iterator suppliers) {
ArrayList names = new ArrayList();
while (suppliers.hasNext()) {
names.add(((Supplier) suppliers.next()).getName());
}
return names.iterator();
}
void printSupplierNames() {
printOut(getSupplierNames());
}
Okay, that should be starting to look a little more promising. We now have
a method - getSupplierNames - that we could call in a number of
places to pass on to other methods. We have definitely added some "modularity"
to our system.
We have had to create an intermediate ArrayList object to hold the
values before creating the resulting Iterator.
We want to find a way to deliver an Iterator that performs this
transformation incrementally, in the same way that Streams interact. Here is the
simple - but excellent! - idea.
Iterator getSupplierNames(final Iterator suppliers) {
return new Iterator() {
public boolean hasNext() {
return suppliers.hasNext();
}
public Object next() {
return ((Supplier) suppliers.next()).getName();
}
public void remove() {
suppliers.remove();
}
};
}
An important point here is that the suppliers parameter
is declared to be final, this allows it to be accessed within the
anonymous Iterator we defined.
Take a while to look at that code. The returned Iterator transforms the
original iteration of Supplier objects and "resolves" each to the value of
the getName method.
It achieves this incrementally, and there is no significant performance penalty in using this iteration over the custom code we began with - no matter how large the iteration!.
No, but I hope you agree that even that is pretty neat.
Let's evolve this idea a little further:
We have identified one of the key transforming patterns - resolving. Let's encapsulate this pattern in a class:
public abstract class Resolver implements Iterator {
Iterator m_src;
public Resolver(Iterator src) {
m_src = src;
}
public boolean hasNext() {
return m_src.hasNext();
}
public Object next() {
return resolve(m_src.next());
}
public void remove() {
m_src.remove();
}
protected abstract Object resolve(Object src);
}
Iterator getSupplierNames(Iterator suppliers) {
return new Resolver(suppliers) {
protected Object resolve(Object src) {
return ((Supplier) src).getName();
}
};
}
We are now starting to get close to the Striterator, which is just
a few more evolution steps along the way.
To discover striteration patterns, simply examine your own iterations
and consider what is going on. You should find the following examples:
Validity checking - should the "current" object be processed?
A very common check is on the class of the object.
if (obj instanceof MyClass) {
process(obj);
}
Where the value of the iteration is expanded into a further iteration.
Iterator children = ((Parent) obj).getChildren();
while (children.hasNext()) {
...
Appending is not something that is initially evident, but once you begin to use the other patterns - to check, resolve and expand - it becomes clear that on occasion there is a need to concatenate two iterations.
In languages like Lisp mapping functions have always been recognized
for their power.
More recently, the so called Visitor pattern has been popularised to
do something similar.
The Visitor is a side-effecting pattern that carries out some processing
when passed each member of the iteration.
Before developing the different pattern implementation classes we shall code some utility classes.
This utility class simply wraps an Enumeration object to provide an
Iterator
public class EnumIterator implements Iterator {
Enumeration m_src;
public EnumIterator(Enumeration src) {
m_src = src;
}
public hasNext() {
return m_src.hasMoreElements();
}
public Object next() {
return m_src.nextElement();
}
public void remove() {
throw new UnsupportedOperationException();
}
}
Wraps an Iterator object implementing both Iterator and
Enumeration interfaces.
public class WrappedIterator implements Iterator, Enumeration {
Iterator m_src;
public WrappedIterator(Iterator src) {
m_src = src;
}
public WrappedIterator(Enumeration src) {
m_src = new EnumIterator(src);
}
// Iterator
public hasNext() {
return m_src.hasNext();
}
public Object next() {
return m_src.next();
}
public void remove() {
throw m_src.remove();
}
// Enumeration
public hasMoreElements() {
return hasNext();
}
public Object nextElement() {
return next();
}
}
Rather than returning null, an Iterator returning function
should always return a valid Iterator
public class EmptyIterator implements Iterator {
public final static Iterator s_standard = new EmptyIterator();
public EmptyItertator() {}
public hasNext() {
return false;
}
public Object next() {
throw NoSuchElementException();
}
public void remove() {
throw NoSuchElementException();
}
}
Occasionally we may have only a single value to return, a SingleValueIterator
supports this pattern in a straightforward way.
public class SingleValueIterator implements Iterator {
Object m_src;
boolean m_dataAvail = true;
public SingleValueIterator(Object src) {
m_src = src;
}
public boolean hasNext() {
return m_dataAvail;
}
public Object next() {
if (m_dataAvail) {
m_dataAvail = false;
return m_src;
} else {
throw new NoSuchElementException();
}
}
public void remove() {
throw UnsupportedOperationException();
}
}
Here are anstract classes for the patterns discussed above.
We'll recode the Resolver class to use the WrappedIterator:
public abstract class Resolver extends WrappedIterator {
public Resolver(Iterator src) {
super(src);
}
final public Object next() {
return resolve(super.next());
}
protected abstract Object resolve(Object src);
}
The Filter class needs to read ahead to determine if there exists a valid
next object.
public abstract class Filter extends WrappedIterator {
Object m_current = null;
boolean m_hasNext = false;
public Filter(Iterator src) {
super(src);
checkNext();
}
private void checkNext() {
m_hasNext = false;
while (super.hasNext()) {
m_current = super.next();
m_hasNext = isValid(m_current);
if (m_hasNext) {
break;
}
}
}
protected boolean isValid(Object src);
public boolean hasNext() {
return m_hasNext;
}
public Object next() {
if (!m_hasNext) {
throw NoSuchElementException();
} else {
Object ret = m_current;
checkNext();
return ret;
}
}
}
The Expander maintains a "child" iterator for the current element. Like
the Filter class it is required to do some eager computation to support the
hasNext protocol.
public abstract class Expander extends WrappedIterator {
Iterator m_child = null;
public Expander(Iterator src) {
super(src);
checkNext();
}
private void checkNext() {
while (!m_child.hasNext()) {
if (super.hasNext()) {
m_child = expand(super.next());
}
}
}
protected abstract Iterator expand();
public boolean hasNext() {
return m_child.hasNext();
}
public Object next() {
Object ret = m_child.next();
checkNext();
return ret;
}
public void remove() {
m_child.remove();
}
}
The Appender just needs to overide next to check when
to append the second iterator.
public abstract class Appender extends WrappedIterator {
Iterator m_other = other;
public Appender(Iterator src, Iterator other) {
super(src);
m_other = other;
if (!hasNext()) {
replace();
}
}
private void replace() {
m_src = m_other;
m_other = null;
}
public Object next() {
Object ret = super.next();
if (!hasNext()) {
replace();
}
return ret;
}
}
The Striterator adds an extra twist to the code we have developed so far.
It supports the "injection" of transformers, as an alternative to always needing to create
a new Iterator from an existing Iterator, for example:
Iterator process(Iterator src) {
Iterator iter1 = new Resolver(src) {
protected Object resolve(Object src) {
...
}
};
Iterator iter2 = new Expander(iter1) {
protected Iterator expand(Object src) {
...
}
};
return iter2;
}
With a Striterator the following is written:
Striterator process(Striterator src) {
src.inject(new Resolver() {
protected Object resolve(Object src) {
...
}
} );
src.inject(new Expander() {
protected Iterator expand(Object src) {
...
}
} );
return src;
}
For this to work properly, we need to add a new method to WrappedIterator
- setSource:
public void setSource(Iterator src) {
m_src = src;
}
- and we will need alternative constructors for the pattern implementation classes. Then the
inject method will look like this:
public Striterator inject(WrappedIterator iter) {
iter.setSource(m_src);
m_src = iter;
return this;
}
Here is the code for a utility append method:
public Striterator append(Iterator iter) {
return inject(new Appender(iter));
}
..and that for a addTypeFilter method:
public Striterator addTypeFilter(final Class cls) {
return inject(new Filter() {
protected boolean isValid(Object obj) {
return cls.isInstance(obj);
}
} );
return this;
}
The reason that the utility and inject methods return the
Striterator is to support method chaining, which can allow for
more concise coding. For example:
public Striterator getAllData() {
return getOneSrc().append(getOtherSrc()).append(getLastSrc());
}
...sometimes its just nice not to have to think up variable names.
You may have noticed that the main Striteration patterns have simply
derived from WrappedIterator and not Striterator,
the Striterator class seems only to be used for injecting and
utility methods. This is true. But by always returning Striterators rather
than standard Iterators this is still an advantage.
One pattern that does derive from Striterator is the "Typed Striterators".
Consider the following:
public class StringIterator implements Striterator {
public StringIterator(Iterator src) {
super(src);
}
public String nextString() {
return (String) next();
}
}
Now, it should be pretty clear that this kind of thing is going to be a lot easier once
we get generic functions in java, but, in fact it is not hard anyway, and by using typed
Striterators we can add compile-time type checking as well as avoiding
casting in out own code:
public class CustomerIterator extends Striterator {
public CustomerIterator(Iterator src) {
super(src);
}
public Customer nextCustomer() {
return (Customer) next();
}
}
We can now write code like this:
CustomerIterator getCustomers() {
return new CustomerIterator(m_customers.iterator());
}
and users of the method get a convenient access method to the data within:
CustomerIterator custs = getCustomersp();
while (custs.hasNext() {
System.out.println(custs.nextCustomer().getName());
}
There has always been one major advantage to returning Iterator objects
over Lists and Sets and that is scalability.
An iterator may be active over millions of objects and may be considered analagous to
a database cursor object.
By using Striterators rather than simple Iterators you are creating
structures that can be transformed incrementally using the techniques described here. This is
a very powerful and expressive computational pattern.
Yes, the Striterators are part of the utilities package released under GNU LGPL and can be downloaded from www.cutthecrap.biz/software/downloads.html along with other Cut The Crap software.