Hide and Seek with Collections in Java
Revisiting the Collection Accessor Method pattern with Eclipse Collections.
Relationships are complicated
This blog is a thought experiment. I’m revisiting some classic ideas and exploring some new ones. I have some gut feelings that I am trying to validate through code examples. Patience. I think there is something important in here.
Change happens, and it can be challenging to manage. How we manage change safely is one of the key problems to solve when modeling an object-oriented domain. Relationships between classes that represent the “many” are often modeled in memory with collections. In this blog I will explore the simple model of Order and LineItem. Order has a zero-to-many relationship to LineItem.
We have choices when designing classes with relationships. We have to decide how to represent that relationship in Java. Do we use a List, a Set, a Bag, a Stack, or a Map? Should we make the relationship known publicly or keep it private? How do we share knowledge about the objects contained in the relationship with the outside world?
There are many different ways to implement the Order class in this diagram. The LineItem class is much simpler. We will just use a Java record and make its state immutable.
We will explore three ways to safely model and implement an object-oriented domain using collections in this blog.
- Collection Accessor Method (from two books by Kent Beck)
- Immutable Collections
- Readable Collections
All three approaches share one common rule. Never, ever, ever expose a mutable collection interface as a public method on a domain class. A class should not allow clients to modify its internal state directly. This rule may sound like logical advice until we realize that every interface in the Java Collections Framework is mutable. Ruh-roh! This means every domain model that exposes Java collection types is inherently flawed and unsafe. Ouch. An inconvenient truth for Java developers everywhere.
Thus begins our game of hide and seek with Java collections.
Collection Accessor Method
The idea of encapsulating generic collections behind a domain protected by intention revealing methods is quite old. I was able to trace back this commonsense advice to a pattern named Collection Accessor Method in two books that Kent Beck wrote. The first book is Smalltalk Best Practice Patterns (SBPP, p.96), which has been around since 1997. There is great advice in the book for Smalltalk developers, and honestly, developers working in any OO language. The second book is Implementation Patterns (IP, p.91), which was published in 2007. This book revisits the patterns from SBPP and tailors them with great advice for Java developers.
Both books agree that the simplest way to provide access to the elements in a collection is to expose the collection via a getter method. This is how the trouble with change starts. Both books explain the downsides of clients being able to modify mutable collections outside of the context of the containing class. The books also recommend a better alternative.
Instead, give clients restricted access to operations on the collection through messages that you implement. (Smalltalk Best Practice Patterns, Kent Beck)
Instead, offer methods that provide limited, meaningful access to the information in the collections. (Implementation Patterns, Kent Beck)
Six one way, half a dozen the other.
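To make the problem both books describe concrete, here is a minimal sketch that uses only java.util types. The names LeakyOrder and getLineItems are hypothetical, invented for this illustration, and do not come from either book.

```java
import java.util.ArrayList;
import java.util.List;

class LeakyOrderDemo
{
    public static void main(String[] args)
    {
        LeakyOrder order = new LeakyOrder()
                .addLineItem("Cup", 5.50)
                .addLineItem("Plate", 7.50);
        System.out.println(order.totalOrderValue()); // 13.0

        // A client empties the order without calling any method Order controls.
        order.getLineItems().clear();
        System.out.println(order.totalOrderValue()); // 0.0
    }
}

record LineItem(String name, double value) {}

// Hypothetical class: an order that exposes its internal list through a getter.
class LeakyOrder
{
    private final List<LineItem> lineItems = new ArrayList<>();

    LeakyOrder addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    // The "simplest" accessor: hands out the internal, mutable list itself.
    List<LineItem> getLineItems()
    {
        return this.lineItems;
    }

    double totalOrderValue()
    {
        return this.lineItems.stream().mapToDouble(LineItem::value).sum();
    }
}
```

Nothing in LeakyOrder can detect or veto the call to clear, which is exactly the kind of change outside the context of the containing class that the pattern is meant to prevent.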
When SBPP and IP were written, the collection options that were available to both Smalltalk and Java developers were mutable collections. I wrote some quick examples of encapsulating collections in Java using mutable collections. The domain is simple. An Order has zero or more LineItem instances maintained in some collection type.
Clients do not need to know the type of collection. The Collection could be a List, Set, Stack, Bag or a simple array and the clients wouldn’t know the difference. In this iteration of the problem, I am using a MutableList from Eclipse Collections. All access to the MutableList is protected behind methods exposed on the Order class. LineItem is simply a record class in Java with a String name and double value.
import org.eclipse.collections.api.block.procedure.Procedure;
import org.eclipse.collections.api.factory.Lists;
import org.eclipse.collections.api.list.MutableList;

public class Order
{
    // The collection stays private; clients only see the methods below.
    private final MutableList<LineItem> lineItems = Lists.mutable.empty();

    public Order addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    public void forEachLineItem(Procedure<LineItem> procedure)
    {
        this.lineItems.forEach(procedure);
    }

    public int totalLineItemCount()
    {
        return this.lineItems.size();
    }

    public int countOfLineItem(String name)
    {
        return this.lineItems.count(lineItem -> lineItem.name().equals(name));
    }

    public double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}

public record LineItem(String name, double value) {}
For this simple use case, where querying the collection of lineItems is limited, this works well. The following test class shows the usage of Order with LineItem and the various accessor methods.
import java.util.StringJoiner;

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

public class OrderTest
{
    private final Order order =
            new Order().addLineItem("Cup", 5.50)
                    .addLineItem("Plate", 7.50)
                    .addLineItem("Fork", 3.00)
                    .addLineItem("Spoon", 2.50)
                    .addLineItem("Knife", 3.50);

    @Test
    public void lineItemCount()
    {
        Assertions.assertEquals(5, this.order.totalLineItemCount());
    }

    @Test
    public void totalOrderValue()
    {
        Assertions.assertEquals(22.0, this.order.totalOrderValue());
    }

    @Test
    public void forEachLineItem()
    {
        StringJoiner join = new StringJoiner(",");
        this.order.forEachLineItem(lineItem -> join.add(lineItem.name()));
        Assertions.assertEquals(
                "Cup,Plate,Fork,Spoon,Knife",
                join.toString());
    }

    @Test
    public void countOfLineItem()
    {
        Assertions.assertEquals(1, this.order.countOfLineItem("Plate"));
        Assertions.assertEquals(1, this.order.countOfLineItem("Fork"));
        Assertions.assertEquals(1, this.order.countOfLineItem("Spoon"));
        Assertions.assertEquals(0, this.order.countOfLineItem("Napkin"));
    }
}
There is a downside to this specific implementation of Order I have used with Collection Accessor Method. Order can be mutated, and it is not thread-safe. We could create a synchronized version of Order called SynchronizedOrder, which will protect read and write method calls from stepping on each other. Since we are using Collection Accessor Methods to protect all calls to the contained list, we can simply synchronize each method that accesses it.
public class SynchronizedOrder
{
    private final MutableList<LineItem> lineItems = Lists.mutable.empty();

    public synchronized SynchronizedOrder addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    public synchronized void forEachLineItem(Procedure<LineItem> procedure)
    {
        this.lineItems.forEach(procedure);
    }

    public synchronized int totalLineItemCount()
    {
        return this.lineItems.size();
    }

    public synchronized int countOfLineItem(String name)
    {
        return this.lineItems.count(lineItem -> lineItem.name().equals(name));
    }

    public synchronized double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}
This will start to be overkill as we add more methods. There is also the challenge of testing each method we add for thread-safety, and the possibility that in the future someone might remove or forget to add the synchronized keyword to one of the methods.
We can make Order thread-safe more simply by changing the following definition of lineItems in the Order class. We can create a SynchronizedList by calling asSynchronized on an empty MutableList.
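// The explicit <LineItem> type witness is needed here: without it, the chained
// call infers MutableList<Object>, which cannot be assigned to the field.
private final MutableList<LineItem> lineItems =
        Lists.mutable.<LineItem>empty().asSynchronized();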
A SynchronizedList will block readers and writers accessing the list. The one place in a SynchronizedList that is not protected is iterator. It is up to clients to synchronize access to the collection manually when using an Iterator. This is not a problem for this particular use case, as we do not use Iterator inside of Order, and we do not expose the lineItems to clients.
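The JDK documents the same caveat for its own synchronized wrappers, so a small sketch with java.util.Collections.synchronizedList can show what "synchronize manually" looks like in client code. This uses the JDK wrapper as a stand-in and does not show the Eclipse Collections wrapper itself; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedIterationDemo
{
    // Sums the values with an Iterator (the for-each loop uses one under the hood).
    static double sumWithIterator(List<Double> synchronizedValues)
    {
        double sum = 0.0;
        // Single calls such as add() and size() are synchronized by the wrapper,
        // but iteration spans many calls, so the client must hold the lock itself.
        synchronized (synchronizedValues)
        {
            for (double value : synchronizedValues)
            {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args)
    {
        List<Double> values = Collections.synchronizedList(new ArrayList<>());
        values.add(5.50);
        values.add(7.50);
        values.add(3.00);
        System.out.println(sumWithIterator(values)); // 16.0
    }
}
```

Every client that iterates has to remember the synchronized block, which is the kind of trust in consumers that keeping the list private inside Order avoids.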
Eclipse Collections has another alternative for a thread-safe List. We can use a MultiReaderList.
private final MutableList<LineItem> lineItems =
        Lists.multiReader.empty();
MultiReaderList works as its name implies. Multiple readers can access query methods on the List, and a single writer will block all readers when accessing mutating methods like add or remove.
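The reader/writer behavior described above can be sketched with the JDK's ReentrantReadWriteLock. This is only an illustration of the locking idea under my own assumptions, not how MultiReaderList is implemented; the class ReadWriteLockedPrices is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the multi-reader idea using JDK locks.
class ReadWriteLockedPrices
{
    private final List<Double> prices = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Mutating method: takes the exclusive write lock, blocking readers and other writers.
    void add(double price)
    {
        this.lock.writeLock().lock();
        try
        {
            this.prices.add(price);
        }
        finally
        {
            this.lock.writeLock().unlock();
        }
    }

    // Query method: takes the shared read lock, so many readers can run at once.
    double total()
    {
        this.lock.readLock().lock();
        try
        {
            double sum = 0.0;
            for (double price : this.prices)
            {
                sum += price;
            }
            return sum;
        }
        finally
        {
            this.lock.readLock().unlock();
        }
    }

    public static void main(String[] args)
    {
        ReadWriteLockedPrices prices = new ReadWriteLockedPrices();
        prices.add(5.50);
        prices.add(7.50);
        System.out.println(prices.total()); // 13.0
    }
}
```

Compared with synchronizing every method, the read lock lets concurrent queries proceed together and only serializes them against writes.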
The nice thing about using Collection Accessor Method is that we can change things in the internal implementation without impacting our clients. This is encapsulation used as intended, for good.
Immutable Collections
We have additional options available in Java via Eclipse Collections that were not available in open source in 2007 when Implementation Patterns was published. While encapsulating mutable collections via Collection Accessor Method is preferable in all cases that I can think of, there are situations where exposing a “read-only” version of the collection may reduce the number of collection query methods that have to be provided on a type. There is a tradeoff between protecting a type’s collection and the flexibility of using high-level collection query methods without replicating the entire rich API of a collection library like Eclipse Collections.
If you only have Mutable Collection types, then the Collection Accessor Method pattern is your best option. Exposing a collection via unmodifiable wrappers, or exposing just an Iterator with the remove method forcefully disabled, is awful at best and downright dangerous at worst. Given a choice, I would never expose Iterator on a Java Collection anywhere. Iterator is a concurrency nightmare. I always prefer internal iterators like forEach. These can be protected appropriately by the collection type. Iterator cannot be protected by anyone other than the developer who is using it. This requires intimate knowledge on the part of the consumer of a collection and trust on the part of the provider of the collection.
Trust no one. Use contractually and structurally Immutable Collection types. Do not implicitly trust “immutable” collections masquerading unsafely behind mutable interfaces.
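A short JDK-only sketch shows why I do not trust an unmodifiable wrapper behind a mutable interface. The wrapper is a live view of the backing list, and its type still offers add, which only fails when it is called. The class UnmodifiableViewDemo and its helper rejectsAdd are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewDemo
{
    // Returns true when calling add() on the list fails at runtime.
    static boolean rejectsAdd(List<String> list)
    {
        try
        {
            list.add("Spoon");
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        List<String> internal = new ArrayList<>(List.of("Cup", "Plate"));

        // A read-only view: it still has the List type, add() included.
        List<String> view = Collections.unmodifiableList(internal);
        // An independent unmodifiable copy taken at this moment.
        List<String> snapshot = List.copyOf(internal);

        // The owner keeps mutating; the view changes under the client's feet.
        internal.add("Fork");
        System.out.println(view.size());     // 3
        System.out.println(snapshot.size()); // 2

        // The compiler accepts view.add(...); the failure only appears at runtime.
        System.out.println(rejectsAdd(view)); // true
    }
}
```

Both view and snapshot are declared as List, so neither the compiler nor a reader of the signature can tell which one is safe to hold on to.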
If we convert the domain of Order and LineItem to use Immutable Collection types, we can remove the trust factor from the equation. Consumers can trust that the collection won’t change. Providers can trust that clients have no direct way to modify the collections.
The following is an implementation using a Java record to define Order with an ImmutableBag<LineItem> for the collection of lineItems. LineItem is also a Java record.
import org.eclipse.collections.api.bag.ImmutableBag;
import org.eclipse.collections.api.factory.Bags;

public record Order(ImmutableBag<LineItem> lineItems)
{
    public Order()
    {
        this(Bags.immutable.empty());
    }

    // Returns a new Order; this Order and its bag are left untouched.
    public Order addLineItem(String name, double value)
    {
        return new Order(lineItems.newWith(new LineItem(name, value)));
    }

    public double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}

public record LineItem(String name, double value) {}
There is an empty constructor here so an Order can be constructed without passing an ImmutableBag<LineItem>. This is for convenience so we can emulate the building of an immutable Order by using an addLineItem method that looks very similar to the method we used with Collection Accessor Method.
The unit test changes as follows with this new definition of Order.
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.util.StringJoiner;

import org.eclipse.collections.api.bag.Bag;
import org.junit.jupiter.api.Test;

public class OrderTest
{
    private final Order order =
            new Order().addLineItem("Cup", 5.50)
                    .addLineItem("Plate", 7.50)
                    .addLineItem("Fork", 3.00)
                    .addLineItem("Spoon", 2.50)
                    .addLineItem("Knife", 3.50);

    @Test
    public void lineItemCount()
    {
        assertEquals(5, this.order.lineItems().size());
    }

    @Test
    public void totalOrderValue()
    {
        assertEquals(22.0, order.totalOrderValue());
    }

    @Test
    public void forEachLineItem()
    {
        StringJoiner joiner = new StringJoiner(",");
        // A Bag is unordered, so sort before asserting on order.
        this.order.lineItems()
                .toSortedListBy(LineItem::name)
                .forEach(lineItem -> joiner.add(lineItem.name()));
        assertEquals("Cup,Fork,Knife,Plate,Spoon", joiner.toString());
    }

    @Test
    public void makeStringLineItems()
    {
        assertEquals(
                "Cup,Fork,Knife,Plate,Spoon",
                this.order.lineItems()
                        .toSortedListBy(LineItem::name)
                        .makeString(LineItem::name, "", ",", ""));
    }

    @Test
    public void countOfLineItem()
    {
        Bag<String> counts =
                this.order.lineItems().countBy(LineItem::name);
        assertEquals(1, counts.occurrencesOf("Plate"));
        assertEquals(1, counts.occurrencesOf("Fork"));
        assertEquals(1, counts.occurrencesOf("Spoon"));
        assertEquals(0, counts.occurrencesOf("Napkin"));
    }

    @Test
    public void iteratorRemoveThrowsOnImmutableBag()
    {
        assertThrows(
                UnsupportedOperationException.class,
                () -> this.order.lineItems().iterator().remove());
    }
}
While the calls to addLineItem look very similar between the mutable and immutable cases, the results are very different. Each call to addLineItem in the immutable case creates a new Order with a new ImmutableBag<LineItem>. Look closely at the code for addLineItem and you will see the difference. The client code looks very similar.
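The difference is easiest to see by holding on to the intermediate orders. The sketch below is a JDK-only analogue that I wrote for illustration, using List.copyOf in place of ImmutableBag; JdkOrder is a hypothetical name and is not the Order record shown above.

```java
import java.util.ArrayList;
import java.util.List;

class ImmutableOrderDemo
{
    public static void main(String[] args)
    {
        JdkOrder empty = new JdkOrder();
        JdkOrder one = empty.addLineItem("Cup", 5.50);
        JdkOrder two = one.addLineItem("Plate", 7.50);

        // Each earlier order is unchanged by the later calls.
        System.out.println(empty.lineItems().size()); // 0
        System.out.println(one.lineItems().size());   // 1
        System.out.println(two.lineItems().size());   // 2
    }
}

record LineItem(String name, double value) {}

// Hypothetical JDK-only analogue of the immutable Order record.
record JdkOrder(List<LineItem> lineItems)
{
    // Compact constructor: keep an unmodifiable copy of whatever is passed in.
    JdkOrder
    {
        lineItems = List.copyOf(lineItems);
    }

    JdkOrder()
    {
        this(List.of());
    }

    // Does not touch this order; builds and returns a new one.
    JdkOrder addLineItem(String name, double value)
    {
        List<LineItem> copy = new ArrayList<>(this.lineItems);
        copy.add(new LineItem(name, value));
        return new JdkOrder(copy);
    }
}
```

One caveat of this analogue is that java.util.List still declares add and remove, so the immutability is only enforced at runtime. That is the gap the ImmutableBag interface closes by not declaring those methods at all.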
We can safely expose the lineItems to clients here because ImmutableBag is both contractually and structurally immutable. There are no mutating methods on ImmutableBag like add or remove. The Order in the immutable case is thread-safe because the lineItems collection cannot be modified, which rules out race conditions on it. Exposing lineItems reduces the amount of query code that we need to write on Order, and increases the total amount of behavior clients can use when querying the lineItems. I will leave it to the judgement of the reader what makes sense to still provide on Order using Collection Accessor Method.
Readable Collections
Using Collection Accessor Method completely, or exposing explicit access to an ImmutableBag<LineItem>, are two safe approaches to modeling an object-oriented domain with collections. I prefer the Immutable Collection approach, but there is another solution that sits somewhere between Collection Accessor Method and Immutable Collections, which I call Readable Collections.
The design of collection types in Eclipse Collections uses a predictable triad of interfaces. The interfaces are Mutable, Immutable, and Readable. The Readable interfaces sometimes end in the suffix Iterable, so as not to collide with the mutable collection types defined in java.util like List, Set, Map.
The type RichIterable here is “Readable”. The type only has read-only methods that do not mutate the collection. The one exception to this is the method remove on Iterator, which can be accessed after calling iterator. This is a design tradeoff that was made so RichIterable can be used in Java 5 style foreach loops by extending the java.lang.Iterable interface.
We can use the Collection Accessor Method approach to protect modification through the Order interface, but we can expose a “readable” interface like RichIterable for the lineItems.
The one additional trick is to wrap the lineItems in an unmodifiable wrapper by calling asUnmodifiable. This guarantees that a client cannot modify the collection directly by casting it to a Mutable type. As it turns out, there is no risk of a client calling lineItems().iterator().remove(), because MultiReaderCollection types in Eclipse Collections do not support iterator() and throw an UnsupportedOperationException when it is called, as Iterator instances are thread-dangerous.
The following shows how the Readable approach for lineItems made public as RichIterable<LineItem> is implemented.
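import org.eclipse.collections.api.RichIterable;
import org.eclipse.collections.api.bag.MultiReaderBag;
import org.eclipse.collections.api.factory.Bags;

public class Order
{
    private final MultiReaderBag<LineItem> lineItems =
            Bags.multiReader.empty();

    public Order addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    // Clients get a read-only type, wrapped so a cast cannot recover mutability.
    public RichIterable<LineItem> lineItems()
    {
        return this.lineItems.asUnmodifiable();
    }

    public double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}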
The following unit test shows how the Order class can be used.
public class OrderTest
{
    private final Order order =
            new Order().addLineItem("Cup", 5.50)
                    .addLineItem("Plate", 7.50)
                    .addLineItem("Fork", 3.00)
                    .addLineItem("Spoon", 2.50)
                    .addLineItem("Knife", 3.50);

    @Test
    public void lineItemCount()
    {
        assertEquals(5, this.order.lineItems().size());
    }

    @Test
    public void totalOrderValue()
    {
        assertEquals(22.0, order.totalOrderValue());
    }

    @Test
    public void forEachLineItem()
    {
        StringJoiner joiner = new StringJoiner(",");
        this.order.lineItems()
                .toSortedListBy(LineItem::name)
                .forEach(lineItem -> joiner.add(lineItem.name()));
        assertEquals("Cup,Fork,Knife,Plate,Spoon", joiner.toString());
    }

    @Test
    public void makeStringLineItems()
    {
        assertEquals(
                "Cup,Fork,Knife,Plate,Spoon",
                this.order.lineItems()
                        .toSortedListBy(LineItem::name)
                        .makeString(LineItem::name, "", ",", ""));
    }

    @Test
    public void countOfLineItem()
    {
        Bag<String> counts =
                this.order.lineItems().countBy(LineItem::name);
        assertEquals(1, counts.occurrencesOf("Plate"));
        assertEquals(1, counts.occurrencesOf("Fork"));
        assertEquals(1, counts.occurrencesOf("Spoon"));
        assertEquals(0, counts.occurrencesOf("Napkin"));
    }

    @Test
    public void iteratorThrowsOnMultiReaderBag()
    {
        assertThrows(UnsupportedOperationException.class,
                () -> this.order.lineItems().iterator());
    }
}
Final Thoughts
We explored the following three approaches to modeling an OO domain with collections representing relationships between classes.
- Collection Accessor Method
- Immutable Collections
- Readable Collections
The Collection Accessor Method is the safest and most straightforward approach when modeling with Collections. It is also the least flexible. If we need more collection query methods, we have to add more methods to our domain classes providing access to these methods.
Immutable Collections give the best combination of safety and flexibility. If your application data follows a “load data up front for read-only querying” style then this approach works very well. If the data in your application is constantly changing, there is a cost to continually changing the data in immutable collections.
Readable Collections give a good combination of safety and flexibility and are better suited to occasional writes happening to application data throughout the course of the application lifecycle.
There is a fourth way (shhhh!!!)
I don’t know if we’re ready for this yet. What happens if we change the lineItems method in the Readable Collections Order implementation to the following:
public RichIterable<LineItem> lineItems()
{
    return this.lineItems.asLazy();
}
The Order tests still pass, with the exception of the test that expects iterator to throw on MultiReaderBag. The Iterator throws on remove instead.
Take a look at the class diagram of RichIterable above again, and you will see RichIterable has a subtype named LazyIterable. LazyIterable is not a Collection. LazyIterable is lazy like Stream, but it is not “one and done” like Stream. A LazyIterable can be safely used over and over again, like any Iterable.
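For readers who have not bumped into the “one and done” behavior of Stream, here is a small JDK-only sketch of the contrast. It does not use LazyIterable; the lambda-based Iterable is my own stand-in for "lazy but reusable", and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ReusableIterableDemo
{
    // Collects whatever the Iterable yields on one pass.
    static List<String> drain(Iterable<String> iterable)
    {
        List<String> result = new ArrayList<>();
        for (String each : iterable)
        {
            result.add(each);
        }
        return result;
    }

    // True when a second terminal operation on the same Stream is rejected.
    static boolean secondUseFails(Stream<String> stream)
    {
        stream.count();
        try
        {
            stream.count();
            return false;
        }
        catch (IllegalStateException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        List<String> names = List.of("cup", "plate", "fork");

        // Lazy and reusable: each call to iterator() starts a fresh pipeline.
        Iterable<String> upperCased =
                () -> names.stream().map(String::toUpperCase).iterator();
        System.out.println(drain(upperCased)); // [CUP, PLATE, FORK]
        System.out.println(drain(upperCased)); // [CUP, PLATE, FORK]

        // A Stream can be consumed only once.
        System.out.println(secondUseFails(names.stream().map(String::toUpperCase))); // true
    }
}
```

The reusable Iterable does no work until someone iterates it, and it redoes the mapping on every pass, which is the general shape of the laziness described above.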
I will leave exploring the possibilities of using LazyIterable as an exercise for the reader.
Thanks for reading, and I hope this blog helps you find safer and better ways of modeling your object-oriented domain classes with collections.
Enjoy!
I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions.