Level Up Coding

Map-Oriented Programming in Java

Using MOP may be convenient sometimes, but it can also be messy.

To the Batpoles!

I created polls in Twitter(!X) and LinkedIn asking if developers use a Bag/Multiset type or if they just use Java Map. Unsurprisingly, java.util.Map is dominating both polls.

(Apologies for anyone who hasn’t seen the 1960s TV version of Batman… the Batpoles were two poles hidden behind a sliding bookcase and Batman and Robin would use them to slide down to the Batcave.)

The question is not meant to determine whether Bag or Map is a better type. Both interfaces serve different purposes, and have different behaviors. A Map associates keys to values. A Bag is an unordered non-unique collection that makes it easy to track counts of things, and is usually backed by a Map to speed up lookups. A Map can be used to keep track of counts, the same as a hammer can be used to punch a hole through concrete instead of using a jackhammer. No one I know would argue that a hammer is a better tool to punch a hole through concrete, but if you don’t have a jackhammer on hand, you whack the heck out of that concrete with the hammer that you do have.
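To make the hammer-versus-jackhammer comparison concrete, here is a minimal sketch that counts things both ways using only the JDK. The SimpleBag class is hypothetical and written just for this illustration; it is not the Bag from any of the libraries mentioned, although occurrencesOf borrows its name from Eclipse Collections.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountingWithMap
{
    // Map-oriented counting: the counting logic lives in the caller.
    static Map<String, Integer> countWithMap(List<String> items)
    {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : items)
        {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    // Hypothetical minimal Bag: the same Map, with the counting behavior encapsulated.
    static final class SimpleBag<T>
    {
        private final Map<T, Integer> counts = new HashMap<>();

        void add(T item)
        {
            this.counts.merge(item, 1, Integer::sum);
        }

        // Returns 0 for absent items, so callers never see null.
        int occurrencesOf(Object item)
        {
            return this.counts.getOrDefault(item, 0);
        }
    }

    public static void main(String[] args)
    {
        List<String> fruit = List.of("apple", "banana", "apple");

        Map<String, Integer> map = countWithMap(fruit);
        System.out.println(map.get("apple"));   // 2
        System.out.println(map.get("cherry"));  // null

        SimpleBag<String> bag = new SimpleBag<>();
        fruit.forEach(bag::add);
        System.out.println(bag.occurrencesOf("apple"));  // 2
        System.out.println(bag.occurrencesOf("cherry")); // 0
    }
}
```

Both versions are backed by a HashMap. The difference is where the counting behavior lives, and what a caller sees for something that was never added.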

When your only tool is a Map, everything else is a key-value pair

Anyway, back to the poll and the ultimate question… There are three libraries named in the poll that provide a Bag/Multiset type in Java: Google Guava, Apache Commons Collections, and Eclipse Collections. Then there is Map from Java. The real question is this: are you willing to take a third-party dependency in your application to get a Bag/Multiset type, or are you happy to stick with Map-Oriented Programming (MOP)? Most developers these days try to limit their third-party dependencies for a variety of reasons (binary size, version conflict resolution, potential vulnerabilities, etc.). This leaves most developers in a position to either leverage the Map-Oriented Programming alternatives sometimes provided by the JDK, or build their own Bag/Multiset solution. Most developers I know go with the Map-Oriented Programming solution.

The following quote is the essence of Map-Oriented Programming (MOP).

We need to get $h!t done now! We’ve got a Map and we’re not afraid to use it!

The problem with MOP is that while Map is super flexible with the data it can contain in the key and value slots it provides, its behavior stays the same. It’s just a Map. You can put things in, and get things out… including null or any other random type. Over the years, the Java Map interface has added new Map-specific behavior that enhances its flexibility, like being able to merge elements, compute them, or get a default value if a key is absent. More specialized behavior, like counting or adding to/removing from a collection in one of the value slots, is not part of the Map contract, so it leaks out into your code, or into algorithms tacked on to Stream and Collector. You lose the ability to have types and structures that provide augmented behavior on top of the Map.
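Here is a small sketch of what that leakage can look like when a Map<String, List<String>> stands in for a Multimap. The class and method names are made up for illustration; only computeIfAbsent and computeIfPresent come from the Map interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapAsMultimap
{
    // Every caller that adds a value has to remember the computeIfAbsent idiom.
    static void put(Map<String, List<String>> map, String key, String value)
    {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Every caller that removes a value has to remember to clean up empty lists.
    static void remove(Map<String, List<String>> map, String key, String value)
    {
        map.computeIfPresent(key, (k, list) -> {
            list.remove(value);
            // Returning null from the remapping function removes the key.
            return list.isEmpty() ? null : list;
        });
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> map = new HashMap<>();
        put(map, "fruit", "apple");
        put(map, "fruit", "pear");
        remove(map, "fruit", "apple");
        System.out.println(map);
    }
}
```

None of this logic is hard, but it has to live somewhere. With a plain Map it lives in helper methods like these, or gets repeated at each call site.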

I’ve enjoyed the benefits of using both dynamic type systems and static type systems having worked professionally in both Smalltalk and Java. Data structures like Map sometimes give you the feeling of the benefits of a dynamic type system, without the benefits of the static type system. I like having a static type system for a lot of reasons, even though it sometimes slows me down when I am developing on my own. I’m not a fan of Map-Oriented Programming, but I will confess to having used it on occasion when it provided a short-term convenience where adding new types was a hassle. When I discover the need for a new type, I usually add it. This is sometimes the hard path, but it is usually the right path. There is a cost for every new type we add to our applications, but there are also the benefits of communication, clarity, encapsulation, reduction in code duplication, increased safety, and improved performance.

Shortcuts aren’t

There are three types the JDK is missing, that have been represented with Map as an alternative. Using Map is leveraging existing type flexibility to avoid cost, in this case for the framework developers. Any incurred cost is moved instead to the application developers that use Map as a return type. The following Map return types used in Collectors can never be changed. These Map return types were introduced on Collectors when Java 8 was released.

// Map<Boolean, List<T>> -> Pair<T, T>
Collectors.partitioningBy()

// Map<T, Long> -> Bag<T>
Collectors.groupingBy(Function.identity(), Collectors.counting())

// Map<K, Collection<V>> -> Multimap<K, V>
Collectors.groupingBy()

Pair, Bag, and Multimap are just some of the types missing from the JDK. We can call Pair something more specific in the case of partitioningBy like Partition, but it’s still essentially a Pair of two things of the same type.
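For anyone who hasn’t used all three, here is a short sketch of the Map each collector produces. The word list and method names are just sample values I picked for the illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class CollectorsReturningMap
{
    static final List<String> WORDS = List.of("one", "two", "three", "two", "three", "three");

    // Stand-in for a Pair/Partition: words split by a predicate.
    static Map<Boolean, List<String>> shortAndLong()
    {
        return WORDS.stream()
                .collect(Collectors.partitioningBy(word -> word.length() <= 3));
    }

    // Stand-in for a Bag: occurrences of each word.
    static Map<String, Long> counts()
    {
        return WORDS.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Stand-in for a Multimap: words grouped by length.
    static Map<Integer, List<String>> byLength()
    {
        return WORDS.stream()
                .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args)
    {
        System.out.println(shortAndLong().get(true)); // [one, two, two]
        System.out.println(counts().get("three"));    // 3
        System.out.println(counts().get("four"));     // null, not 0
        System.out.println(byLength().get(5));        // [three, three, three]
    }
}
```

Note that the count for a word that never occurred is null rather than zero, which is Map behavior showing through where Bag behavior was wanted.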

We don’t need a Pair type!

A well-reasoned decision was made to not add a generic Pair type or support for generic tuples in Java. Instead, the use of named types created via Java Records has been recommended for Java developers since Java 16 was released. This is a reasoned decision that I fully support, even if the open source framework I created (Eclipse Collections) has Pair and Triple types. I appreciate creating specialized types for things, and being able to do so with very little ceremony using Java Records is awesome.

Please fasten your seatbelts for what comes next.

We’ve got a Map!

Using a Map as a generic Pair type is arguably worse than adding a generic Pair type. How do you use a Map as a Pair? There is an example in the Stream and Collectors code in the JDK with partitioningBy.

Let’s look at the following example of partitioningBy, where we filter a Stream of Integer in one pass into separate List instances of evens and odds.

@Test
public void partitioningBy()
{
    Map<Boolean, List<Integer>> map =
            IntStream.rangeClosed(1, 10)
                    .boxed()
                    .collect(Collectors.partitioningBy(each -> each % 2 == 0));

    List<Integer> evens = map.get(true);
    List<Integer> odds = map.get(false);
    List<Integer> ummm = map.get(null);
    List<Integer> ohno = map.get(new Object());

    Assertions.assertEquals(List.of(2, 4, 6, 8, 10), evens);
    Assertions.assertEquals(List.of(1, 3, 5, 7, 9), odds);
    Assertions.assertNull(ummm);
    Assertions.assertNull(ohno);

    ummm = map.getOrDefault(null, evens);
    Assertions.assertEquals(List.of(2, 4, 6, 8, 10), ummm);

    ohno = map.getOrDefault(new Object(), odds);
    Assertions.assertEquals(List.of(1, 3, 5, 7, 9), ohno);
}

This code takes a Stream of Integer from 1 to 10 and filters the even values into one List and the odd values into another List using partitioningBy. The result is a Map<Boolean, List<Integer>>. The true values in the Map are the ones that filter inclusively. The false values in the Map are the ones that filter exclusively. The null values in the Map are the ones that… wait, why are there null values in the Map? Why is there a new Object() lookup in a Map<Boolean, List<Integer>>? What is happening here!?! Recall that Map existed before Java 5 when generics were added to Java, and the get method on Map is not generic and accepts any Object. Ahhhh… Map.

If you’ve never dug into the result of partitioningBy before, it returns an instance of a type named Partition, which is a nested class in Collectors. I knew the partitioningBy method returned a Map<Boolean, List<Type>>, but wasn’t aware of the actual implementation until I looked today. The Partition type is immutable, but still behaves like a Map, as I illustrate above. The get method on Map is not generic, so it accepts ANY object of ANY type. The Partition class does not throw on non-Boolean access via get, but instead returns null. A lookup with a key of any other type results in null. The method getOrDefault and the other read-only Map methods behave consistently with other Map types. Mutating methods like put throw exceptions.
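Here is a small sketch that pokes at those behaviors directly. It reflects what I see in the OpenJDK implementation; the Javadoc for partitioningBy makes no guarantee about the mutability of the returned Map, so treat the put result as an observation rather than a contract. The class and method names are mine.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitionMapBehavior
{
    static Map<Boolean, List<Integer>> evensAndOdds()
    {
        return Stream.of(1, 2, 3, 4)
                .collect(Collectors.partitioningBy(each -> each % 2 == 0));
    }

    // Returns true if put is rejected with an UnsupportedOperationException.
    static boolean putIsRejected(Map<Boolean, List<Integer>> map)
    {
        try
        {
            map.put(true, List.of());
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        Map<Boolean, List<Integer>> map = evensAndOdds();
        System.out.println(map.size());              // always two entries: false and true
        System.out.println(map.containsKey("true")); // a String key compiles, and is simply absent
        System.out.println(putIsRejected(map));
    }
}
```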

What about using a primitive BooleanObjectMap?

For those wondering if I would propose using a primitive version of a BooleanObjectMap from Eclipse Collections to solve the generic get problem with Map… I wouldn’t, and I can’t. A BooleanObjectMap type doesn’t exist in Eclipse Collections. When we designed the primitive Map hierarchy, we made a conscious decision to remove ALL combinations of primitive maps with boolean as a key. There are no boolean to anything maps in Eclipse Collections.

Why?

Having a Map of boolean to anything had a design smell of using Map as a hammer. If you need two values, one for true and one for false, then use two variables to hold the values and put those values in a specific type. The variables in this new type can have intention revealing names (e.g. selected and rejected, in Eclipse Collections PartitionIterable) instead of less meaningful names like ifTrue and ifFalse, or Boolean values in a Map. If you want to pass these values around together in a single generic instance of something, because you can’t or don’t want to add a new type, then use a generic Pair. Buyer beware. You will get less meaningful names for your contained values with a Pair as well (one and two, or left and right).

What if we used an Enum for the key type instead of Boolean?

Another option for using a Map to represent a pair of the same type would be to use an Enum for the key, where the values in the Enum have intention-revealing names (e.g. Filter.SELECTED, Filter.REJECTED). Then we could write map.get(Filter.SELECTED) instead of map.get(true).

Why not?

This solution requires a new Enum type to be created to hold these key names. If we already have to add a new type, it would be better to just define the specific type we need with named variables (e.g. a Partition type with selected and rejected variables). The better names in the Enum also wouldn’t solve the generic problems with the get method on Map. In fact, you could still write map.get(true) and it would return null.
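A quick sketch of the Enum-keyed approach shows both the nicer names and the remaining hole. The Filter enum and the partitionEvens method are hypothetical, using the names from the text above; EnumMap is the JDK class.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class EnumKeyedPartition
{
    // Hypothetical enum, using the names from the text.
    enum Filter { SELECTED, REJECTED }

    static Map<Filter, List<Integer>> partitionEvens(List<Integer> values)
    {
        Map<Filter, List<Integer>> map = new EnumMap<>(Filter.class);
        map.put(Filter.SELECTED, new ArrayList<>());
        map.put(Filter.REJECTED, new ArrayList<>());
        for (Integer value : values)
        {
            map.get(value % 2 == 0 ? Filter.SELECTED : Filter.REJECTED).add(value);
        }
        return map;
    }

    public static void main(String[] args)
    {
        Map<Filter, List<Integer>> map = partitionEvens(List.of(1, 2, 3, 4));
        System.out.println(map.get(Filter.SELECTED)); // [2, 4]
        System.out.println(map.get(true));            // still compiles, returns null
    }
}
```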

Stop Hammer time!

I think it is better for the JDK to leverage its static typing benefits and return specific types instead of returning Map, whenever possible. I think it would have made more sense for partitioningBy to return the Partition type instead of a Map. This would have meant exposing a new public type, since the Partition type is currently private static. The new public type wouldn’t need to be a completely generic type like Pair. The Eclipse Collections partition method on RichIterable returns a PartitionIterable type. There is a cost to adding and maintaining this type and all of its subtypes, which is borne by the Eclipse Collections developers. This gives developers who use the library the safest, most specific alternative at various levels in the type hierarchy.

@Test
public void partition()
{
    PartitionMutableList<Integer> partition =
            Interval.oneTo(10)
                    .partition(each -> each % 2 == 0);

    MutableList<Integer> selected = partition.getSelected();
    MutableList<Integer> rejected = partition.getRejected();

    Assertions.assertEquals(List.of(2, 4, 6, 8, 10), selected);
    Assertions.assertEquals(List.of(1, 3, 5, 7, 9), rejected);
}
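If you can’t take a dependency, a similar shape can be sketched with the JDK alone on Java 16 or later. This is only an illustration: the Partition record and the partition collector below are hypothetical names of my own, not JDK or Eclipse Collections API. The collector wraps partitioningBy so that the Map never escapes to callers.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class TypedPartition
{
    // Hypothetical named type; not part of the JDK.
    record Partition<T>(List<T> selected, List<T> rejected) {}

    // Adapts partitioningBy and converts its Map result into the named type.
    static <T> Collector<T, ?, Partition<T>> partition(Predicate<? super T> predicate)
    {
        return Collectors.collectingAndThen(
                Collectors.partitioningBy(predicate),
                map -> new Partition<>(map.get(true), map.get(false)));
    }

    public static void main(String[] args)
    {
        Partition<Integer> result = IntStream.rangeClosed(1, 10)
                .boxed()
                .collect(partition(each -> each % 2 == 0));

        System.out.println(result.selected()); // [2, 4, 6, 8, 10]
        System.out.println(result.rejected()); // [1, 3, 5, 7, 9]
    }
}
```

With this type there is no get(null) or get(new Object()) to write. The only two things you can ask for are selected and rejected.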

There are two other places where Collectors returns Map as a type that would have been better off as more specific types. The problem is convenience and cost. It was more convenient in the Java 8 release to return a Map instead of a Bag or Multimap, because returning those would have meant introducing Bag and Multimap types and implementations, which would have delayed the Java 8 release, potentially significantly. Having seen these types created in Eclipse Collections many years ago, I can confirm they are both expensive to build and to test. Unfortunately, we are stuck forever with the decision to go with convenience and a return type of Map in Collectors.

I have blogged previously about Map vs. Bag and Map vs. Multimap. Read the blogs at the links below if you would like to understand more.

For some other examples and details of partition in Eclipse Collections, there is a blog for that below. The most interesting thing for some developers in this blog may be the PartitionIterable hierarchy that was implemented to support the covariant nature of the partition method.

Whither Map-Oriented Programming

Map is a hammer. It’s a very useful and convenient tool, but we fall back on Map as an all-purpose tool and flexible return type too much. Java Records give us a new level of convenience with the benefit of static typing. Additional Collection types like Bag and Multimap augment the capabilities of Map with different specialized behaviors for developers to leverage.

In the data-oriented programming space, I prefer solutions like Dataframe libraries which are much more specific than row-based Collection of Map about their features and purpose. I think Java Record gives a nice low-ceremony alternative for creating Collection of Record types that can be statically type checked. This helps provide type safety, memory efficiency, and performance.
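As a rough sketch of the difference, here is the same calculation over a row-based Collection of Map and over a Collection of Record. The Person record, the column names, and the method names are hypothetical values chosen for the illustration.

```java
import java.util.List;
import java.util.Map;

class RowsAsMapsOrRecords
{
    // Hypothetical row type for illustration.
    record Person(String name, int age) {}

    // Collection of Map: keys and value types are only checked at runtime.
    static int totalAgeFromMaps(List<Map<String, Object>> rows)
    {
        int total = 0;
        for (Map<String, Object> row : rows)
        {
            // A typo in "age" or a non-Integer value compiles fine and fails here.
            total += (Integer) row.get("age");
        }
        return total;
    }

    // Collection of Record: field names and types are checked by the compiler.
    static int totalAgeFromRecords(List<Person> people)
    {
        return people.stream().mapToInt(Person::age).sum();
    }

    public static void main(String[] args)
    {
        List<Map<String, Object>> rows = List.of(
                Map.<String, Object>of("name", "Ann", "age", 30),
                Map.<String, Object>of("name", "Bob", "age", 12));
        List<Person> people = List.of(new Person("Ann", 30), new Person("Bob", 12));

        System.out.println(totalAgeFromMaps(rows));      // 42
        System.out.println(totalAgeFromRecords(people)); // 42
    }
}
```

The record version also stores age as a primitive int rather than a boxed Integer in a Map entry, which is part of the memory-efficiency point.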

I hope to see additional Collection types incorporated in the JDK, instead of continuing to use Map as a convenient but messy alternative. I believe the JDK should have Partition, Bag, and Multimap types. Partition is already there as an implementation. Partition just needs to stop pretending to be a Map and be made public, or be represented by a more specific and constrained interface. Unfortunately, since partitioningBy already returns a Map, this method will likely never be changed, but it could instead be deprecated and replaced by an alternative with a better return type.

I hope this blog made you think about the cost/benefit of using Map as an all-purpose return type. My recommendation: don’t! Use Map as a return type only when it is the absolute best option available for a method. If another type would be a better option as a return type, then create it or use it if it already exists.

Thank you for reading!

I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions.

Written by Donald Raab

Java Champion. Creator of the Eclipse Collections OSS Java library (https://github.com/eclipse-collections). Inspired by Smalltalk. Opinions are my own.