Lazy and inexhaustible

Donald Raab
Oct 30, 2017

Laziness is a virtue. Sometimes you want it to be repeatable.

The lazy hazy crazy days of summer are gone but will be back again.

In programming, laziness can be a very good thing. Lazy initialization allows us to avoid creating expensive resources until they are needed. Lazy iteration allows us to avoid creating temporary data structures and potentially to reduce the total amount of work needed in order to perform a computation. Two built-in paradigms existed for lazy iteration in Java before Java 8. They are Iterable and Iterator. An Iterable can be used over and over again, as it can continue to give you a brand new Iterator. An Iterator can only be used once, as there is no way to reset it once you’ve gotten to the last element via next().
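The difference between the two can be seen in a few lines. The following is a minimal sketch using only JDK types; the class name and the drain helper are made up for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;

class IterableVersusIterator {
    // Sums everything the Iterator still has left to give.
    static int drain(Iterator<Integer> iterator) {
        int sum = 0;
        while (iterator.hasNext()) {
            sum += iterator.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        Iterable<Integer> ages = Arrays.asList(1, 2, 3, 4);

        // An Iterable hands out a fresh Iterator on every call,
        // so both traversals see all of the elements.
        int first = drain(ages.iterator());
        int second = drain(ages.iterator());
        System.out.println(first + " " + second);

        // A single Iterator is spent once it reaches the end,
        // so the second drain finds nothing left.
        Iterator<Integer> once = ages.iterator();
        int full = drain(once);
        int empty = drain(once);
        System.out.println(full + " " + empty);
    }
}
```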

In Java 8, Streams were added with methods that are lazy (e.g. map, filter, etc.). A Stream is like an Iterator, in that it can only be used once with a terminal operation like forEach or collect. This means you have to be careful not to exhaust a Stream and then try to use it again.
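To see what "lazy" means here, it helps to count how often a filter actually runs. This is a small sketch with a made-up class name and counter; it assumes nothing beyond the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class LazyStreamDemo {
    // Builds a pipeline whose filter records every element it inspects.
    static IntStream evenAges(AtomicInteger inspected) {
        return IntStream.of(1, 2, 3, 4)
                .filter(age -> {
                    inspected.incrementAndGet();
                    return age % 2 == 0;
                });
    }

    public static void main(String[] args) {
        AtomicInteger inspected = new AtomicInteger();

        // Building the pipeline does no work yet.
        IntStream pipeline = evenAges(inspected);
        System.out.println("inspected before terminal op: " + inspected.get());

        // The terminal operation is what drives the filter.
        int sum = pipeline.sum();
        System.out.println("sum = " + sum + ", inspected after: " + inspected.get());
    }
}
```

The counter stays at zero until sum() is called, which is the point at which the filter finally sees the elements.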

This is what will happen at runtime if you try to use a Stream more than once.

java.lang.IllegalStateException: stream has already been operated upon or closed
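The exception is easy to reproduce outside the kata. Here is a minimal sketch (the class and method names are illustrative) that runs two terminal operations on the same IntStream and catches the failure from the second one.

```java
import java.util.stream.IntStream;

class StreamReuseDemo {
    // Returns the message of the exception thrown by the second terminal operation.
    static String reuse() {
        IntStream ages = IntStream.of(1, 2, 3, 4);
        ages.sum(); // the first terminal operation consumes the stream
        try {
            ages.max(); // a second terminal operation on the same stream
            return "no exception";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reuse());
    }
}
```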

I am going to use the getAgeStatisticsOfPets test in Exercise 4 of the Eclipse Collections Pet Kata to illustrate how you can work with a Stream without getting an IllegalStateException. I will also show you some other alternatives that are lazy using Eclipse Collections.

First, here’s the code I would like to write for the test in Exercise 4 of the Pet Kata. I am using an IntStream (obtained via mapToInt) in order to avoid boxing int as Integer. This code compiles but will fail upon execution.

@Test
public void getAgeStatisticsOfPets()
{
IntStream petAges = this.people
.stream()
.flatMap(person -> person.getPets().stream())
.mapToInt(Pet::getAge);
Set<Integer> uniqueAges =
petAges.boxed().collect(Collectors.toSet());
IntSummaryStatistics stats = petAges.summaryStatistics();
Assert.assertEquals(Sets.mutable.with(1, 2, 3, 4), uniqueAges);
Assert.assertEquals(stats.getMin(), petAges.min().getAsInt());
Assert.assertEquals(stats.getMax(), petAges.max().getAsInt());
Assert.assertEquals(stats.getSum(), petAges.sum());
Assert.assertEquals(stats.getAverage(),
petAges.average().getAsDouble(), 0.0);
Assert.assertEquals(stats.getCount(), petAges.count());
Assert.assertTrue(petAges.allMatch(i -> i > 0));
Assert.assertFalse(petAges.anyMatch(i -> i == 0));
Assert.assertTrue(petAges.noneMatch(i -> i < 0));
}

The code will run until this line attempts to execute.

IntSummaryStatistics stats = petAges.summaryStatistics();

That’s when the IllegalStateException is thrown. The call to collect in the previous line caused the Stream to become exhausted.

One option I have to make the code work is to pre-calculate the pets as a flattened List and then recreate the IntStream for the ages as I need them.

@Test
public void getAgeStatisticsOfPets()
{
List<Pet> petAges = this.people
.stream()
.flatMap(person -> person.getPets().stream())
.collect(Collectors.toList());
Set<Integer> uniqueAges =
petAges.stream()
.mapToInt(Pet::getAge)
.boxed()
.collect(Collectors.toSet());
IntSummaryStatistics stats =
petAges.stream()
.mapToInt(Pet::getAge)
.summaryStatistics();
Assert.assertEquals(Sets.mutable.with(1, 2, 3, 4), uniqueAges);
Assert.assertEquals(stats.getMin(), petAges.stream()
.mapToInt(Pet::getAge).min().getAsInt());
Assert.assertEquals(stats.getMax(), petAges.stream()
.mapToInt(Pet::getAge).max().getAsInt());
Assert.assertEquals(stats.getSum(), petAges.stream()
.mapToInt(Pet::getAge).sum());
Assert.assertEquals(stats.getAverage(), petAges.stream()
.mapToInt(Pet::getAge).average().getAsDouble(),
0.0);
Assert.assertEquals(stats.getCount(), petAges.size());
Assert.assertTrue(
petAges.stream()
.mapToInt(Pet::getAge)
.allMatch(i -> i > 0));
Assert.assertFalse(
petAges.stream()
.mapToInt(Pet::getAge)
.anyMatch(i -> i == 0));
Assert.assertTrue(
petAges.stream()
.mapToInt(Pet::getAge)
.noneMatch(i -> i < 0));
}

This works but I had to write a lot of duplicate code. I have to call this code over and over again to recreate the IntStream of pet ages.

petAges.stream().mapToInt(Pet::getAge)

Since I do not like duplicating code, I want to find a solution for this problem. One solution would be to put this duplicate code in a Supplier and calculate it on demand by calling the get() method on the Supplier.

@Test
public void getAgeStatisticsOfPets()
{
List<Pet> pets = this.people
.stream()
.flatMap(person -> person.getPets().stream())
.collect(Collectors.toList());
Supplier<IntStream> petAges =
() -> pets.stream().mapToInt(Pet::getAge);
Set<Integer> uniqueAges =
petAges.get().boxed().collect(Collectors.toSet());
IntSummaryStatistics stats =
petAges.get().summaryStatistics();
Assert.assertEquals(Sets.mutable.with(1, 2, 3, 4), uniqueAges);
Assert.assertEquals(stats.getMin(),
petAges.get().min().getAsInt());
Assert.assertEquals(stats.getMax(),
petAges.get().max().getAsInt());
Assert.assertEquals(stats.getSum(),
petAges.get().sum());
Assert.assertEquals(stats.getAverage(),
petAges.get().average().getAsDouble(),
0.0);
Assert.assertEquals(stats.getCount(),
petAges.get().count());
Assert.assertTrue(petAges.get().allMatch(i -> i > 0));
Assert.assertFalse(petAges.get().anyMatch(i -> i == 0));
Assert.assertTrue(petAges.get().noneMatch(i -> i < 0));
}

This reduces the amount of duplicate code I had to write. I can go one step further and avoid collecting the result of flatMap into a List, by having the Supplier do more of the work.

@Test
public void getAgeStatisticsOfPets()
{
Supplier<IntStream> petAges =
() -> this.people
.stream()
.flatMap(person -> person.getPets().stream())
.mapToInt(Pet::getAge);
Set<Integer> uniqueAges =
petAges.get().boxed().collect(Collectors.toSet());
IntSummaryStatistics stats =
petAges.get().summaryStatistics();
Assert.assertEquals(Sets.mutable.with(1, 2, 3, 4), uniqueAges);
Assert.assertEquals(stats.getMin(),
petAges.get().min().getAsInt());
Assert.assertEquals(stats.getMax(),
petAges.get().max().getAsInt());
Assert.assertEquals(stats.getSum(),
petAges.get().sum());
Assert.assertEquals(stats.getAverage(),
petAges.get().average().getAsDouble(),
0.0);
Assert.assertEquals(stats.getCount(),
petAges.get().count());
Assert.assertTrue(petAges.get().allMatch(i -> i > 0));
Assert.assertFalse(petAges.get().anyMatch(i -> i == 0));
Assert.assertTrue(petAges.get().noneMatch(i -> i < 0));
}
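If you want to try the Supplier approach without the kata project, the following is a self-contained sketch. The Pet and Person classes here are hypothetical stand-ins with only the methods the example needs; they are not the kata's domain classes.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SupplierSketch {
    // Hypothetical stand-ins for the kata's Pet and Person classes.
    static final class Pet {
        private final int age;
        Pet(int age) { this.age = age; }
        int getAge() { return this.age; }
    }

    static final class Person {
        private final List<Pet> pets;
        Person(Pet... pets) { this.pets = Arrays.asList(pets); }
        List<Pet> getPets() { return this.pets; }
    }

    // Each call to get() builds a brand new IntStream over the same source.
    static Supplier<IntStream> petAges(List<Person> people) {
        return () -> people.stream()
                .flatMap(person -> person.getPets().stream())
                .mapToInt(Pet::getAge);
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person(new Pet(1), new Pet(2)),
                new Person(new Pet(2), new Pet(4)),
                new Person());

        Supplier<IntStream> ages = petAges(people);
        Set<Integer> uniqueAges = ages.get().boxed().collect(Collectors.toSet());
        IntSummaryStatistics stats = ages.get().summaryStatistics();

        System.out.println(uniqueAges);
        System.out.println(stats);
    }
}
```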

This almost feels like creating a lazy Iterable, where each time we need to do something, we create an Iterator to perform an additional function. In Eclipse Collections, there is a LazyIterable type that can be created from any RichIterable. A LazyIterable can be used safely as many times as you want. It may be expensive to recalculate the functions over and over again, but it will allow you to do so and will not become exhausted after the first time you use it.
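The general idea of a lazy Iterable can be sketched with plain JDK types. To be clear, this is not how Eclipse Collections implements LazyIterable; lazyMap and sum are hypothetical helpers written only to show a view that applies its function during iteration and hands out a fresh Iterator each time.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class LazyMappingSketch {
    // Hypothetical helper: wraps a source Iterable and applies the function
    // only while a consumer is iterating. Nothing is stored.
    static <T, R> Iterable<R> lazyMap(
            Iterable<T> source,
            Function<? super T, ? extends R> function) {
        // Iterable has a single abstract method, so a lambda can supply
        // a fresh Iterator for every traversal.
        return () -> {
            Iterator<T> delegate = source.iterator();
            return new Iterator<R>() {
                @Override
                public boolean hasNext() {
                    return delegate.hasNext();
                }

                @Override
                public R next() {
                    return function.apply(delegate.next());
                }
            };
        };
    }

    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tabby", "Dolly", "Spot");
        Iterable<Integer> lengths = lazyMap(names, String::length);

        // The same lazy view can be traversed repeatedly.
        System.out.println(sum(lengths));
        System.out.println(sum(lengths));
    }
}
```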

The following shows how you can solve this problem using a LazyIntIterable with Eclipse Collections.

@Test
public void getAgeStatisticsOfPets()
{
LazyIntIterable petAges = this.people
.asLazy()
.flatCollect(Person::getPets)
.collectInt(Pet::getAge);
IntSet uniqueAges = petAges.toSet();
IntSummaryStatistics stats = petAges.summaryStatistics();
Assert.assertEquals(
IntSets.mutable.with(1, 2, 3, 4),
uniqueAges);
Assert.assertEquals(stats.getMin(), petAges.min());
Assert.assertEquals(stats.getMax(), petAges.max());
Assert.assertEquals(stats.getSum(), petAges.sum());
Assert.assertEquals(stats.getAverage(), petAges.average(), 0.0);
Assert.assertEquals(stats.getCount(), petAges.size());
Assert.assertTrue(petAges.allSatisfy(i -> i > 0));
Assert.assertFalse(petAges.anySatisfy(i -> i == 0));
Assert.assertTrue(petAges.noneSatisfy(i -> i < 0));
}

Once I have a LazyIntIterable, I do not need to box the unique ages into a Set of Integer. I can instead store them in an IntSet as I have above, simply by calling toSet() on the LazyIntIterable.

Because LazyIntIterable is lazy, it does not pre-calculate and store the pet ages. It has to execute the flatCollect() and collectInt() each time you call a terminal method like toSet, summaryStatistics, min, max, sum, average, size, any/all/noneSatisfy. If I want the code to be more efficient, I can pre-calculate the pet ages and store them in an IntList or IntBag. I will use an IntBag here, as there are duplicate ages but order doesn’t matter.

@Test
public void getAgeStatisticsOfPets()
{
IntBag petAges = this.people
.asLazy()
.flatCollect(Person::getPets)
.collectInt(Pet::getAge)
.toBag();
IntSet uniqueAges = petAges.toSet();
IntSummaryStatistics stats = petAges.summaryStatistics();
Assert.assertEquals(
IntSets.mutable.with(1, 2, 3, 4),
uniqueAges);
Assert.assertEquals(stats.getMin(), petAges.min());
Assert.assertEquals(stats.getMax(), petAges.max());
Assert.assertEquals(stats.getSum(), petAges.sum());
Assert.assertEquals(stats.getAverage(), petAges.average(), 0.0);
Assert.assertEquals(stats.getCount(), petAges.size());
Assert.assertTrue(petAges.allSatisfy(i -> i > 0));
Assert.assertFalse(petAges.anySatisfy(i -> i == 0));
Assert.assertTrue(petAges.noneSatisfy(i -> i < 0));
}

All I had to change in the code to make this work was to call the method toBag() after calling collectInt() and change the type of petAges from LazyIntIterable to IntBag. No other code needed to change. This is because our primitive collections and primitive lazy iterables in Eclipse Collections have good symmetry. Notice how there is no boxing of int to Integer objects in either the LazyIntIterable or IntBag solution.

I can easily change the type from IntBag to IntList, just by changing the toBag() method call to toList().

@Test
public void getAgeStatisticsOfPets()
{
IntList petAges = this.people.asLazy()
.flatCollect(Person::getPets)
.collectInt(Pet::getAge)
.toList();
IntSet uniqueAges = petAges.toSet();
IntSummaryStatistics stats = petAges.summaryStatistics();
Assert.assertEquals(
IntSets.mutable.with(1, 2, 3, 4),
uniqueAges);
Assert.assertEquals(stats.getMin(), petAges.min());
Assert.assertEquals(stats.getMax(), petAges.max());
Assert.assertEquals(stats.getSum(), petAges.sum());
Assert.assertEquals(stats.getAverage(), petAges.average(), 0.0);
Assert.assertEquals(stats.getCount(), petAges.size());
Assert.assertTrue(petAges.allSatisfy(i -> i > 0));
Assert.assertFalse(petAges.anySatisfy(i -> i == 0));
Assert.assertTrue(petAges.noneSatisfy(i -> i < 0));
}

Once again, nothing else needs to change.
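The cost difference between recalculating lazily and pre-calculating once can be made visible with a counter. The sketch below uses only JDK types as an analogy: a Supplier of IntStream stands in for the lazy iterable and an int[] stands in for the IntBag or IntList. The ageOf method is a made-up placeholder for an expensive function.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.IntStream;

class RecalculationSketch {
    // Placeholder for an expensive Pet::getAge; counts its own invocations.
    static int ageOf(int petId, AtomicInteger calls) {
        calls.incrementAndGet();
        return petId % 4 + 1;
    }

    // Lazy: three terminal operations, each re-running the mapping.
    static int callsWhenLazy(int petCount) {
        AtomicInteger calls = new AtomicInteger();
        Supplier<IntStream> ages =
                () -> IntStream.range(0, petCount).map(id -> ageOf(id, calls));
        ages.get().min();
        ages.get().max();
        ages.get().sum();
        return calls.get();
    }

    // Stored: the mapping runs once, then the stored ints are re-read.
    static int callsWhenStored(int petCount) {
        AtomicInteger calls = new AtomicInteger();
        int[] ages = IntStream.range(0, petCount).map(id -> ageOf(id, calls)).toArray();
        IntStream.of(ages).min();
        IntStream.of(ages).max();
        IntStream.of(ages).sum();
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("lazy:   " + callsWhenLazy(5));
        System.out.println("stored: " + callsWhenStored(5));
    }
}
```

With five pets and three terminal operations, the lazy version calls the function fifteen times and the stored version calls it five times. Which one is preferable depends on how expensive the function is and how much memory the stored values take.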

When you call min, max and average on an IntStream, you will get an OptionalInt or OptionalDouble. This is a good thing if you have the potential to have an empty result. OptionalInt and OptionalDouble will allow you to handle the cases where the result is empty. With Eclipse Collections, there is a different option for these three methods to help in the case where the Iterable or Collection is empty.

Assert.assertEquals(stats.getMin(), petAges.minIfEmpty(0));
Assert.assertEquals(stats.getMax(), petAges.maxIfEmpty(0));
Assert.assertEquals(stats.getSum(), petAges.sum());
Assert.assertEquals(stats.getAverage(), petAges.averageIfEmpty(0.0), 0.0);

The methods minIfEmpty, maxIfEmpty and averageIfEmpty allow you to specify a default value to use in the case of an empty result. In the future, we may also add minOptional, maxOptional and averageOptional if there is a need for them.
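For comparison, a similar default-on-empty behavior can be had from the JDK's OptionalInt and OptionalDouble by calling orElse. The helper names below are hypothetical and exist only to show the Stream-side equivalent of the idea.

```java
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class EmptyResultSketch {
    // Hypothetical helpers mirroring the "IfEmpty" idea with JDK Optionals.
    static int minOrDefault(int[] ages, int defaultValue) {
        OptionalInt min = IntStream.of(ages).min();
        return min.orElse(defaultValue);
    }

    static double averageOrDefault(int[] ages, double defaultValue) {
        OptionalDouble average = IntStream.of(ages).average();
        return average.orElse(defaultValue);
    }

    public static void main(String[] args) {
        int[] none = {};
        int[] some = {1, 2, 3, 4};

        // An empty source falls back to the supplied default.
        System.out.println(minOrDefault(none, 0));
        System.out.println(averageOrDefault(none, 0.0));

        // A non-empty source ignores the default.
        System.out.println(minOrDefault(some, 0));
        System.out.println(averageOrDefault(some, 0.0));
    }
}
```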

If you use Streams, and want them to be re-usable, then consider using them in conjunction with a Supplier. This will reduce the amount of duplicate code you will have to write. If you want inexhaustible laziness out of the box, then consider using Eclipse Collections, as you will get many more options that you can use alongside Streams.

I hope this blog was useful and informative and showed some options for using Streams and Eclipse Collections LazyIterables effectively to solve the same problems. I also hope that you try out the Eclipse Collections katas on your own. I often teach the katas using both Streams and Eclipse Collections so developers can learn both APIs and understand what options they have available to them.

I am a Project Lead and Committer for the Eclipse Collections OSS project at the Eclipse Foundation. Eclipse Collections is open for contributions. If you like the library, you can let us know by starring it on GitHub.

Donald Raab

Java Champion. Creator of the Eclipse Collections OSS Java library (https://github.com/eclipse/eclipse-collections). Inspired by Smalltalk. Opinions are my own.