Hide and Seek with Collections in Java

Donald Raab
Aug 3, 2024

Revisiting the Collection Accessor Method pattern with Eclipse Collections.

Photo by Nathan Dumlao on Unsplash

Relationships are complicated

This blog is a thought experiment. I’m revisiting some classic ideas and exploring some new ideas. I have some gut feelings that I am trying to validate through code examples. Patience. I think there is something important in here.

Change happens, and it can be challenging to manage. How we manage change safely is one of the key problems to solve when modeling an object-oriented domain. Relationships between classes that represent the “many” are often modeled in memory with collections. In this blog I will explore the simple model of Order and LineItem. Order has a zero-to-many relationship to LineItem.

Relationship between Order and LineItem in UML

We have choices when designing classes with relationships. We have to decide how to represent that relationship in Java. Do we use a List, a Set, a Bag, a Stack, or a Map? Should we make the relationship known publicly or keep it private? How do we share knowledge about the objects contained in the relationship with the outside world?

There are many different ways to implement the Order class in this diagram. The LineItem class is much simpler. We will just use a Java record and make its state immutable.

We will explore three ways to safely model and implement an object-oriented domain using collections in this blog.

  • Collection Accessor Method (from two books by Kent Beck)
  • Immutable Collections
  • Readable Collections

All three approaches share one common rule. Never, ever, ever expose a mutable collection interface as a public method on a domain class. A class should not allow clients to modify its internal state directly. This rule may sound like logical advice until we realize that every interface in the Java Collection Framework is mutable. Ruh-roh! This means every domain modeled and exposing Java collection types is inherently flawed and unsafe. Ouch. An inconvenient truth for Java developers everywhere.
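To make the point about mutable interfaces concrete, here is a minimal JDK-only sketch. The class and method names are hypothetical. It shows that a list from List.of cannot be changed, yet because its static type is java.util.List, a call to add still compiles and is only refused at run time.

```java
import java.util.List;

final class MutableInterfaceDemo
{
    // Returns true if calling add on the list throws UnsupportedOperationException.
    static boolean addIsRejectedAtRuntime(List<String> list)
    {
        try
        {
            list.add("Napkin"); // compiles: add is part of the List interface
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true; // the interface offered add, the implementation refused it
        }
    }

    public static void main(String[] args)
    {
        List<String> names = List.of("Cup", "Plate");
        System.out.println(addIsRejectedAtRuntime(names));
    }
}
```

The compiler cannot help here. The only signal a client gets is an exception after the code has shipped.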

Thus begins our game of hide and seek with Java collections.

Collection Accessor Method

The idea of encapsulating generic collections behind a domain protected by intention-revealing methods is quite old. I was able to trace this commonsense advice back to a pattern named Collection Accessor Method in two books that Kent Beck wrote. The first book is Smalltalk Best Practice Patterns (SBPP, p.96), which has been around since 1997. There is great advice in the book for Smalltalk developers, and honestly, for developers working in any OO language. The second book is Implementation Patterns (IP, p.91), which was published in 2007. This book revisits the patterns from SBPP and tailors them with great advice for Java developers.

Both books agree that the simplest way to provide access to the elements in a collection is to expose the collection via a getter method. This is how the trouble with change starts. Both books explain the downsides of clients being able to modify mutable collections outside of the context of the containing class. The books also recommend a better alternative.
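The trouble with the getter approach can be sketched in a few lines. This is a hypothetical JDK-only example (LeakyOrder and getLineItems are names invented for illustration, not code from either book). A client empties the order without the order ever being asked.

```java
import java.util.ArrayList;
import java.util.List;

final class LeakyOrderDemo
{
    record LineItem(String name, double value) {}

    // Hypothetical Order that hands out its internal list through a getter.
    static final class LeakyOrder
    {
        private final List<LineItem> lineItems = new ArrayList<>();

        LeakyOrder addLineItem(String name, double value)
        {
            this.lineItems.add(new LineItem(name, value));
            return this;
        }

        List<LineItem> getLineItems()
        {
            return this.lineItems; // the internal list escapes here
        }
    }

    static int sizeAfterClientClear()
    {
        LeakyOrder order = new LeakyOrder().addLineItem("Cup", 5.50);
        order.getLineItems().clear(); // the client changes the order behind its back
        return order.getLineItems().size();
    }

    public static void main(String[] args)
    {
        System.out.println(sizeAfterClientClear());
    }
}
```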

Instead, give clients restricted access to operations on the collection through messages that you implement. (Smalltalk Best Practice Patterns, Kent Beck)

Instead, offer methods that provide limited, meaningful access to the information in the collections. (Implementation Patterns, Kent Beck)

Six one way, half a dozen the other.

When SBPP and IP were written, the collection options that were available to both Smalltalk and Java developers were mutable collections. I wrote some quick examples of encapsulating collections in Java using mutable collections. The domain is simple. An Order has zero or more LineItem instances maintained in some collection type.

Order implemented with MutableList protected by Collection Accessor Methods

Clients do not need to know the type of collection. The Collection could be a List, Set, Stack, Bag or a simple array and the clients wouldn’t know the difference. In this iteration of the problem, I am using a MutableList from Eclipse Collections. All access to the MutableList is protected behind methods exposed on the Order class. LineItem is simply a record class in Java with a String name and double value.

import org.eclipse.collections.api.block.procedure.Procedure;
import org.eclipse.collections.api.factory.Lists;
import org.eclipse.collections.api.list.MutableList;

public class Order
{
    // The list never leaves this class; clients only see the methods below.
    private final MutableList<LineItem> lineItems = Lists.mutable.empty();

    public Order addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    public void forEachLineItem(Procedure<LineItem> procedure)
    {
        this.lineItems.forEach(procedure);
    }

    public int totalLineItemCount()
    {
        return this.lineItems.size();
    }

    public int countOfLineItem(String name)
    {
        return this.lineItems.count(lineItem -> lineItem.name().equals(name));
    }

    public double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}

public record LineItem(String name, double value) {}

For this simple use case, where querying the collection of lineItems is limited, this works well. The following test class shows the usage of Order with LineItem and the various accessor methods.

import java.util.StringJoiner;

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

public class OrderTest
{
    private final Order order =
            new Order().addLineItem("Cup", 5.50)
                    .addLineItem("Plate", 7.50)
                    .addLineItem("Fork", 3.00)
                    .addLineItem("Spoon", 2.50)
                    .addLineItem("Knife", 3.50);

    @Test
    public void lineItemCount()
    {
        Assertions.assertEquals(5, this.order.totalLineItemCount());
    }

    @Test
    public void totalOrderValue()
    {
        Assertions.assertEquals(22.0, this.order.totalOrderValue());
    }

    @Test
    public void forEachLineItem()
    {
        StringJoiner join = new StringJoiner(",");
        this.order.forEachLineItem(lineItem -> join.add(lineItem.name()));
        Assertions.assertEquals(
                "Cup,Plate,Fork,Spoon,Knife",
                join.toString());
    }

    @Test
    public void countOfLineItem()
    {
        Assertions.assertEquals(1, this.order.countOfLineItem("Plate"));
        Assertions.assertEquals(1, this.order.countOfLineItem("Fork"));
        Assertions.assertEquals(1, this.order.countOfLineItem("Spoon"));
        Assertions.assertEquals(0, this.order.countOfLineItem("Napkin"));
    }
}

There is a downside to this specific implementation of Order I have used with Collection Accessor Method. Order can be mutated, and it is not thread-safe. We could create a synchronized version of Order called SynchronizedOrder, which will protect read and write method calls from stepping on each other. Since we are using Collection Accessor Methods to protect all calls to the contained list, we can simply synchronize each method that accesses it.

public class SynchronizedOrder
{
    private final MutableList<LineItem> lineItems = Lists.mutable.empty();

    public synchronized SynchronizedOrder addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    public synchronized void forEachLineItem(Procedure<LineItem> procedure)
    {
        this.lineItems.forEach(procedure);
    }

    public synchronized int totalLineItemCount()
    {
        return this.lineItems.size();
    }

    public synchronized int countOfLineItem(String name)
    {
        return this.lineItems.count(lineItem -> lineItem.name().equals(name));
    }

    public synchronized double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}

This will start to be overkill as we add more methods. There is also the challenge of testing each method we add for thread-safety, and the possibility that in the future someone might remove or forget to add the synchronized keyword to one of the methods.

We can make Order thread-safe more simply by changing the definition of lineItems in the Order class.

We can create a SynchronizedList by calling asSynchronized on an empty MutableList.

// The <LineItem> type witness is needed because of the chained call;
// without it, empty() is inferred as MutableList<Object>.
private final MutableList<LineItem> lineItems =
        Lists.mutable.<LineItem>empty().asSynchronized();

A SynchronizedList will block readers and writers accessing the list. The one place in a SynchronizedList that is not protected is iterator. It is up to clients to synchronize access to the collection manually when using an Iterator. This is not a problem for this particular use case, as we do not use Iterator inside of Order, and we do not expose the lineItems to clients.
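The same caveat is documented for the JDK's Collections.synchronizedList, which makes for a small illustration. The sketch below uses only the JDK, and the class and method names are hypothetical. It is not a statement about how Eclipse Collections implements asSynchronized. It only shows the kind of manual locking a client has to remember when an Iterator is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SynchronizedIterationDemo
{
    static double total(List<Double> synchronizedValues)
    {
        double sum = 0.0;
        // Single calls such as add or size lock internally, but an Iterator
        // spans many calls, so the caller must hold the list's lock for the loop.
        synchronized (synchronizedValues)
        {
            for (double value : synchronizedValues)
            {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args)
    {
        List<Double> values = Collections.synchronizedList(new ArrayList<>());
        values.add(5.50);
        values.add(7.50);
        System.out.println(total(values));
    }
}
```

Nothing forces a client to write the synchronized block, which is why keeping the iterator out of client hands is the safer design.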

Eclipse Collections has another alternative for a thread-safe List. We can use a MultiReaderList.

private final MutableList<LineItem> lineItems =
        Lists.multiReader.empty();

MultiReaderList works as its name implies. Multiple readers can access query methods on the List, and a single writer will block all readers when accessing mutating methods like add or remove.
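The multiple-readers, single-writer idea can be sketched with the JDK's ReentrantReadWriteLock. This is an illustration of the locking discipline only, under the assumption that a read/write lock is a fair mental model. It is not the Eclipse Collections implementation, and ReadWriteLockedNames is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReadWriteLockedNames
{
    private final List<String> names = new ArrayList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void add(String name)
    {
        this.lock.writeLock().lock(); // exclusive: waits for readers, blocks new ones
        try
        {
            this.names.add(name);
        }
        finally
        {
            this.lock.writeLock().unlock();
        }
    }

    int size()
    {
        this.lock.readLock().lock(); // shared: many readers may hold it together
        try
        {
            return this.names.size();
        }
        finally
        {
            this.lock.readLock().unlock();
        }
    }

    public static void main(String[] args)
    {
        ReadWriteLockedNames names = new ReadWriteLockedNames();
        names.add("Cup");
        System.out.println(names.size());
    }
}
```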

The nice thing about using Collection Accessor Method is that we can change things in the internal implementation without impacting our clients. This is using encapsulation as it is intended for goodness.

Immutable Collections

We have additional options available in Java via Eclipse Collections that were not available in open source in 2007 when Implementation Patterns was published. While encapsulating mutable collections via Collection Accessor Method is preferable in all cases that I can think of, there are situations where exposing a “read-only” version of the Collection may reduce the number of collection query methods that have to be provided on a type. There is a tradeoff between protecting a type’s collection and the flexibility of using high-level collection query methods without replicating the entire rich API of a collection library like Eclipse Collections.

If you only have Mutable Collection types, then the Collection Accessor Method pattern is your best option. Exposing a collection via unmodifiable wrappers, or exposing just an Iterator with the remove method forcefully removed, is awful at best and downright dangerous at worst. Given a choice, I would never expose Iterator on a Java Collection anywhere. Iterator is a concurrency nightmare. I always prefer internal iterators like forEach. These can be protected appropriately by the collection type. Iterator cannot be protected by anyone other than the developer who is using it. This requires intimate knowledge on the part of the consumer of a collection and trust on the part of the provider of the collection.
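A small JDK-only sketch shows one way this goes wrong. The names are hypothetical. An unmodifiable wrapper is only a view, so a client holding an Iterator over it is still exposed to changes the owner makes to the backing list, even on a single thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class UnmodifiableViewDemo
{
    static boolean iteratorFailsAfterOwnerAdds()
    {
        List<String> internal = new ArrayList<>(List.of("Cup", "Plate"));
        List<String> view = Collections.unmodifiableList(internal);

        Iterator<String> iterator = view.iterator(); // client starts iterating
        iterator.next();
        internal.add("Fork"); // owner changes the backing list
        try
        {
            iterator.next();
            return false;
        }
        catch (ConcurrentModificationException e)
        {
            return true; // the "read-only" view was never isolated from change
        }
    }

    public static void main(String[] args)
    {
        System.out.println(iteratorFailsAfterOwnerAdds());
    }
}
```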

Trust no one. Use contractually and structurally Immutable Collection types. Do not implicitly trust “immutable” collections masquerading unsafely behind mutable interfaces.

If we convert the domain of Order and LineItem to use Immutable Collection types, we can remove the trust factor from the equation. Consumers can trust that the collection won’t change. Providers can trust that clients have no direct way to modify the collections.

Order implemented with an ImmutableBag that is made available to clients

The following is an implementation using a Java record to define Order with an ImmutableBag<LineItem> for the collection of lineItems. LineItem is also a Java record.

import org.eclipse.collections.api.bag.ImmutableBag;
import org.eclipse.collections.api.factory.Bags;

public record Order(ImmutableBag<LineItem> lineItems)
{
    public Order()
    {
        this(Bags.immutable.empty());
    }

    // Returns a new Order; this instance and its bag are left unchanged.
    public Order addLineItem(String name, double value)
    {
        return new Order(this.lineItems.newWith(new LineItem(name, value)));
    }

    public double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}

public record LineItem(String name, double value) {}

There is an empty constructor here so an Order can be constructed without passing an ImmutableBag<LineItem>. This is for convenience so we can emulate the building of an immutable Order by using an addLineItem method that looks very similar to the method we used with Collection Accessor Method.

The unit test changes as follows with this new definition of Order.

import java.util.StringJoiner;

import org.eclipse.collections.api.bag.Bag;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

public class OrderTest
{
    private final Order order =
            new Order().addLineItem("Cup", 5.50)
                    .addLineItem("Plate", 7.50)
                    .addLineItem("Fork", 3.00)
                    .addLineItem("Spoon", 2.50)
                    .addLineItem("Knife", 3.50);

    @Test
    public void lineItemCount()
    {
        assertEquals(5, this.order.lineItems().size());
    }

    @Test
    public void totalOrderValue()
    {
        assertEquals(22.0, this.order.totalOrderValue());
    }

    @Test
    public void forEachLineItem()
    {
        // A Bag has no defined order, so sort before comparing.
        StringJoiner joiner = new StringJoiner(",");
        this.order.lineItems()
                .toSortedListBy(LineItem::name)
                .forEach(lineItem -> joiner.add(lineItem.name()));
        assertEquals("Cup,Fork,Knife,Plate,Spoon", joiner.toString());
    }

    @Test
    public void makeStringLineItems()
    {
        assertEquals(
                "Cup,Fork,Knife,Plate,Spoon",
                this.order.lineItems()
                        .toSortedListBy(LineItem::name)
                        .makeString(LineItem::name, "", ",", ""));
    }

    @Test
    public void countOfLineItem()
    {
        Bag<String> counts =
                this.order.lineItems().countBy(LineItem::name);

        assertEquals(1, counts.occurrencesOf("Plate"));
        assertEquals(1, counts.occurrencesOf("Fork"));
        assertEquals(1, counts.occurrencesOf("Spoon"));
        assertEquals(0, counts.occurrencesOf("Napkin"));
    }

    @Test
    public void iteratorRemoveThrowsOnImmutableBag()
    {
        assertThrows(
                UnsupportedOperationException.class,
                () -> this.order.lineItems().iterator().remove());
    }
}

While the calls to addLineItem look very similar between the mutable and immutable cases, the results are very different. Each call to addLineItem in the immutable case creates a new Order with a new ImmutableBag<LineItem>. Look closely at the code for addLineItem and you will see the difference, even though the client code looks nearly identical.
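The copy-on-add behavior can be observed in a JDK-only stand-in. This sketch is an assumption-laden analogy: it uses java.util.List with List.copyOf in place of ImmutableBag, so the exposed interface is still a mutable one, which is exactly the weakness described earlier. It is here only to show that the original Order is untouched after addLineItem.

```java
import java.util.ArrayList;
import java.util.List;

final class CopyOnAddDemo
{
    record LineItem(String name, double value) {}

    // JDK-only stand-in for the record above; not the Eclipse Collections version.
    record Order(List<LineItem> lineItems)
    {
        Order
        {
            lineItems = List.copyOf(lineItems); // defensive, unmodifiable copy
        }

        Order addLineItem(String name, double value)
        {
            List<LineItem> copy = new ArrayList<>(this.lineItems);
            copy.add(new LineItem(name, value));
            return new Order(copy); // a new Order; this one is untouched
        }
    }

    public static void main(String[] args)
    {
        Order empty = new Order(List.of());
        Order one = empty.addLineItem("Cup", 5.50);
        System.out.println(empty.lineItems().size() + " " + one.lineItems().size());
    }
}
```

A client that ignores the return value of addLineItem keeps working with the old Order, which is the main habit to unlearn when moving from the mutable version.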

We can safely expose the lineItems to clients here because ImmutableBag is both contractually and structurally immutable. There are no mutating methods on ImmutableBag like add or remove. The Order in the immutable case is thread-safe because the lineItems collection cannot be modified and cause any race conditions. This reduces the amount of query code that we need to write on Order, and increases the total amount of behavior clients can use when querying the lineItems. I will leave it to the judgement of the reader what makes sense to still provide on Order using Collection Accessor Method.

Readable Collections

Using Collection Accessor Method completely, or exposing explicit access to an ImmutableBag<LineItem>, are two safe approaches to modeling an object-oriented domain with collections. I prefer the Immutable Collection approach, but there is another solution that sits somewhere between Collection Accessor Method and Immutable Collections, which I call Readable Collections.

The design of collection types in Eclipse Collections uses a predictable triad of interfaces. The interfaces are Mutable, Immutable, and Readable. The Readable interfaces sometimes end in the suffix Iterable, so as not to collide with the mutable collection types defined in java.util like List, Set, and Map.

The parent interfaces for all collection types in Eclipse Collections

The type RichIterable here is “Readable”. The type only has read-only methods that do not mutate the collection. The one exception to this is the method remove on Iterator which can be accessed after calling iterator. This is a design tradeoff that was made so RichIterable can be used in Java 5 style foreach loops by extending the java.lang.Iterable interface.

We can use the Collection Accessor Method approach to protect modification through the Order interface, but we can expose a “readable” interface like RichIterable for the lineItems.

Order with a MultiReaderBag for lineItems exposed as RichIterable<LineItem>

The one additional trick is to wrap the lineItems in an Unmodifiable wrapper by calling asUnmodifiable. This guarantees that a client will not be able to modify the collection directly by casting the type to a Mutable type. As it turns out, there is no risk of a client calling lineItems().iterator().remove(), because MultiReaderCollection types in Eclipse Collections throw UnsupportedOperationException from iterator(). Iterator instances are thread-dangerous.
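Why the wrapper matters can be sketched with plain JDK types. The names below are hypothetical, and Iterable plus Collections.unmodifiableList stand in for RichIterable and asUnmodifiable. Returning a read-only type is not enough, because a determined client can cast back to the mutable type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CastingDemo
{
    private final List<String> names = new ArrayList<>(List.of("Cup", "Plate"));

    Iterable<String> leakyNames()
    {
        return this.names; // read-only type, but the same mutable object
    }

    Iterable<String> wrappedNames()
    {
        return Collections.unmodifiableList(this.names); // mutators throw
    }

    int size()
    {
        return this.names.size();
    }

    // Returns true if the client managed to add through a downcast.
    static boolean castAndAdd(Iterable<String> iterable)
    {
        try
        {
            ((List<String>) iterable).add("Napkin");
            return true;
        }
        catch (UnsupportedOperationException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        CastingDemo demo = new CastingDemo();
        System.out.println(castAndAdd(demo.leakyNames()));
        System.out.println(castAndAdd(demo.wrappedNames()));
    }
}
```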

The following shows how the Readable approach is implemented, with lineItems made public as RichIterable<LineItem>.

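import org.eclipse.collections.api.RichIterable;
import org.eclipse.collections.api.bag.MultiReaderBag;
import org.eclipse.collections.api.factory.Bags;

public class Order
{
    private final MultiReaderBag<LineItem> lineItems =
            Bags.multiReader.empty();

    public Order addLineItem(String name, double value)
    {
        this.lineItems.add(new LineItem(name, value));
        return this;
    }

    // Clients get a readable type, wrapped so a cast cannot reach the mutators.
    public RichIterable<LineItem> lineItems()
    {
        return this.lineItems.asUnmodifiable();
    }

    public double totalOrderValue()
    {
        return this.lineItems.sumOfDouble(LineItem::value);
    }
}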

The following unit test shows how the Order class can be used.

import java.util.StringJoiner;

import org.eclipse.collections.api.bag.Bag;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

public class OrderTest
{
    private final Order order =
            new Order().addLineItem("Cup", 5.50)
                    .addLineItem("Plate", 7.50)
                    .addLineItem("Fork", 3.00)
                    .addLineItem("Spoon", 2.50)
                    .addLineItem("Knife", 3.50);

    @Test
    public void lineItemCount()
    {
        assertEquals(5, this.order.lineItems().size());
    }

    @Test
    public void totalOrderValue()
    {
        assertEquals(22.0, this.order.totalOrderValue());
    }

    @Test
    public void forEachLineItem()
    {
        StringJoiner joiner = new StringJoiner(",");
        this.order.lineItems()
                .toSortedListBy(LineItem::name)
                .forEach(lineItem -> joiner.add(lineItem.name()));
        assertEquals("Cup,Fork,Knife,Plate,Spoon", joiner.toString());
    }

    @Test
    public void makeStringLineItems()
    {
        assertEquals(
                "Cup,Fork,Knife,Plate,Spoon",
                this.order.lineItems()
                        .toSortedListBy(LineItem::name)
                        .makeString(LineItem::name, "", ",", ""));
    }

    @Test
    public void countOfLineItem()
    {
        Bag<String> counts =
                this.order.lineItems().countBy(LineItem::name);

        assertEquals(1, counts.occurrencesOf("Plate"));
        assertEquals(1, counts.occurrencesOf("Fork"));
        assertEquals(1, counts.occurrencesOf("Spoon"));
        assertEquals(0, counts.occurrencesOf("Napkin"));
    }

    @Test
    public void iteratorThrowsOnMultiReaderBag()
    {
        assertThrows(
                UnsupportedOperationException.class,
                () -> this.order.lineItems().iterator());
    }
}

Final Thoughts

We explored the following three approaches to modeling an OO domain with collections representing relationships between classes.

  • Collection Accessor Method
  • Immutable Collections
  • Readable Collections

The Collection Accessor Method is the safest and most straightforward approach when modeling with Collections. It is also the least flexible. If we need more collection query methods, we have to add more methods to our domain classes providing access to these methods.

Immutable Collections give the best combination of safety and flexibility. If your application data follows a “load data up front for read-only querying” style then this approach works very well. If the data in your application is constantly changing, there is a cost to continually changing the data in immutable collections.

Readable Collections give a good combination of safety and flexibility and are better suited to occasional writes happening to application data throughout the course of the application lifecycle.

There is a fourth way (shhhh!!!)

I don’t know if we’re ready for this yet. What happens if we change the lineItems method in the Readable Collections Order implementation to the following:

public RichIterable<LineItem> lineItems()
{
    return this.lineItems.asLazy();
}

The Order tests still pass, with the exception of the iterator throwing on MultiReaderBag. The Iterator throws on remove instead.

Take a look at the class diagram of RichIterable above again, and you will see RichIterable has a subtype named LazyIterable. LazyIterable is not a Collection. LazyIterable is lazy like Stream, but it is not “one and done” like Stream. A LazyIterable can be safely used over and over again, like any Iterable.
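The “one and done” contrast can be seen with JDK types alone. In this sketch the class name is hypothetical, and a lambda-backed Iterable is only a rough stand-in for LazyIterable, not its implementation. A Stream refuses a second traversal, while an Iterable that builds a fresh lazy pipeline on each call to iterator can be walked repeatedly.

```java
import java.util.List;
import java.util.stream.Stream;

final class OneAndDoneDemo
{
    static boolean streamFailsOnSecondUse()
    {
        Stream<String> stream = List.of("Cup", "Plate").stream();
        stream.count(); // the first terminal operation consumes the stream
        try
        {
            stream.count();
            return false;
        }
        catch (IllegalStateException e)
        {
            return true; // a Stream cannot be traversed twice
        }
    }

    // Counts every element seen across two full traversals.
    static int iterateTwice(Iterable<String> iterable)
    {
        int seen = 0;
        for (String each : iterable)
        {
            seen++;
        }
        for (String each : iterable)
        {
            seen++;
        }
        return seen;
    }

    public static void main(String[] args)
    {
        List<String> source = List.of("Cup", "Plate");
        // Lazy and reusable: the mapping runs again on each call to iterator().
        Iterable<String> lazy = () -> source.stream().map(String::toUpperCase).iterator();
        System.out.println(streamFailsOnSecondUse());
        System.out.println(iterateTwice(lazy));
    }
}
```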

I will leave exploring the possibilities of using LazyIterable as an exercise for the reader.

Thanks for reading, and I hope this blog helps you find safer and better ways of modeling your object-oriented domain classes with collections.

Enjoy!

I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions.

Donald Raab

Java Champion. Creator of the Eclipse Collections OSS Java library (https://github.com/eclipse/eclipse-collections). Inspired by Smalltalk. Opinions are my own.