Spliterating Hairs Results in Spliterating Deja Vu
How a “Random” question led me down a Java Spliterator rabbit hole.
A Better RandomAccess Default From Long Ago
I responded to a comment on a recent blog with a link to a blog that I had a written a few years ago. The blog details the journey that I went on taking a discovery and idea all the way to be included in OpenJDK. This blog and the links it contains are the only organized body of evidence that I am aware of that explain the creation, existence and purpose of RandomAccessSpliterator. The following is the blog.
This story happened a long time ago, but it got me wondering, whatever happened with RandomAccessSpliterator? I mean I knew this class will probably be in Java forever, but I wondered how often it actually gets used today.
Note for the reader: This is the entrance to the rabbit hole. I owe losses of large amounts of personal time to questions like these. If you value your time and accept the universe as it is, then do not ask yourself questions like these. If you do find yourself asking these kinds of questions, you may learn more than you wanted to know.
Find Usages?
One does not simply create RandomAccessSpliterator. It is a default Spliterator, created for RandomAccess List types that have not defined their own spliterator() method. This means, instances of this type can only be discovered at runtime. As evidence, I tried IntelliJ Find Usages on RandomAccessSpliterator on the Eclipse Collections code base. The only place it is created is in a default method on the List interface in the JDK.
While I’ve known I cannot just find usages of this type, I thought there must be another way. So I decided to run the ~167K unit tests in the Eclipse Collections test suite and turn a breakpoint on. Then I learned something I had never tried before. You don’t have to suspend a breakpoint. You can just output a message when the breakpoint is hit. Woo hoo! Runtime usages!
Now when I run the Eclipse Collections unit test suite, this is what I see in the console with this breakpoint set.
Now I just Googled to see if I could find a count of the number of times, and StackOverflow had a question and answer. So I found the tab in IntelliJ, and sure enough there’s the same count I did by hand a day earlier. 🤦♂️
Ok, this is where the story should end. You learned some cool stuff about IntelliJ and debugging and counting breakpoints, win!
I wonder if RandomAccessSpliterator is used anywhere in the JDK code base?
Note to reader: This is where I lose sight of the top of the rabbit hole and enter into free fall into the JDK code base and hours of running JMH Benchmarks.
The Ballad of List12 and ListN
I don’t have the code for running the JDK unit tests on my machine. I’ve never run them, and not sure how I would. That’s a rabbit hole for me to fall down a different day maybe when I’m retired.
I decided to just poke around and try things out with JDK types. I’ll make the story short. I discovered that RandomAccessSpliterator is used by the two classes created by calling List.of(). Yes, the immutable (or unmodifiable if you prefer) lists we’ve been creating since they were added to Java 9, use RandomAccessSpliterator, which was also added in Java 9. Instances of List.of() get RandomAccessSpliterator by default because they don’t define a spliterator() method.
Oh, shaving cream! RandomAccessSpliterator lives!
Now, this rabbit hole, had a little detour. As it turns out, List12 did not define a spliterator() method in Java 21, but it defines one in Java 25. So it must have been added somewhere between Java 21 and 25. The method looks like this in Java 25.
I wrote a test to show a bunch of Spliterator types used by commonly used List types in Java.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
public class SpliteratorTest
{
@Test
public void listNSpliteratorType()
{
List<Integer> integers = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
assertEquals(
"RandomAccessSpliterator",
integers.spliterator().getClass().getSimpleName());
}
@Test
public void list12SpliteratorType()
{
List<Integer> list1Of12 = List.of(1);
assertEquals(
"",
list1Of12.spliterator().getClass().getSimpleName());
List<Integer> list2Of12 = List.of(1, 2);
assertEquals(
"RandomAccessSpliterator",
list2Of12.spliterator().getClass().getSimpleName());
}
@Test
public void ArraysAsListSpliteratorType()
{
List<Integer> arraysAsList =
Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
assertEquals(
"ArraySpliterator",
arraysAsList.spliterator().getClass().getSimpleName());
}
@Test
public void ArrayListSpliteratorType()
{
List<Integer> arrayList =
new ArrayList<>(List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
assertEquals(
"ArrayListSpliterator",
arrayList.spliterator().getClass().getSimpleName());
}
}Now the simple name of the Spliterator used for single element instances of the List12 type is… empty. This is because it is defined as an anonymous inner class. Anonymous indeed!
I was initially surprised to see there are three named Spliterator types here, not two. I was expecting ArrayList to use ArraySpliterator. The reason it does not, is because the Spliterator for ArrayList has to deal with potential ConcurrentModificationException exceptions being thrown if the modCount variable inherited from AbstractList is triggered. RandomAccessSpliterator has a dual mode which checks if a RandomAccess type extends from AbstractList, and if so adds the modCount logic. If not it ignores it.
Ok, so while I think it is really cool to see an idea I had for a default Spliterator implementation 12 years ago is actively used by newer collection types, I found myself asking the question.
Why would ListN use RandomAccessSpliterator instead of ArraySplitterator, since it doesn’t extend AbstractList and doesn’t need to deal with modCount? List12 makes more sense to me, as it is not backed by an array, and is still RandomAccess.
Note to the reader: This is when I saw the same black cat walk by my doorway twice. This is also where the rabbit hole got very deep, as it caused me to spend 8–10 hours just writing and running JMH benchmarks. I am going to keep it short and sweet for you and just share one straightforward benchmark to consider.
Deja Vu
I’ve been here before. Twelve years ago I found myself discovering and proving that IteratorSpliterator was a terrible default for RandomAccess types when using parallelStream(). RandomAccessSpliterator eventually stepped in as a much better default implementation for folks who could not or did not want to provide their own spliterator() override.
While RandomAccessSpliterator is a good default alternative, I believe that ArraySpliterator must be a better performing alternative for a List type with a backing array that is immutable. The array and immutable aspects are the key. All of the complicated logic needed to check for modCount changes goes away. And with an array, we don’t have to use method calls like get() to look up elements at indexes. Win!
This must be measurable, right? I believe so. To the rabbit hole!
After much testing and trying different combinations of things, I decided I would settle on one benchmark to share. If folks want to try out their own benchmarks to prove if this is worth it or not, I wish them luck and much patience. I am satisfied, and believe that having ListN be as fast to iterate with stream() and parallelStream() as Arrays.asList() would be a good thing, and beneficial to the entire Java community.
Note: We have used ArraySpliterator for all of our array backed List types for years in Eclipse Collections. This is for two reasons. One, it’s a simple and fast Spliterator that we didn’t have to write. Two, we don’t use modCount in our collection types, so don’t require the use of modCount sensitive Spliterators.
The benchmark Results
I ran benchmarks calculating the combination of min and max using Stream.reduce(). I ran these benchmarks on my MacBook Pro M2 Max with 12 Cores and 96GB RAM.
Result output (minus the prefix long test name):
Benchmark Mode Cnt Score Error Units
minMaxArrayList thrpt 20 158.375 ± 3.787 ops/s
minMaxArraysAsList thrpt 20 208.214 ± 0.686 ops/s
minMaxListN thrpt 20 97.352 ± 0.717 ops/s
parallelMinMaxArrayList thrpt 20 1149.748 ± 60.342 ops/s
parallelMinMaxArraysAsList thrpt 20 1387.468 ± 8.229 ops/s
parallelMinMaxListN thrpt 20 1062.055 ± 15.685 ops/sUnit is operations per second so bigger is better.
The Code:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.eclipse.collections.api.tuple.primitive.IntIntPair;
import org.eclipse.collections.impl.list.Interval;
import org.eclipse.collections.impl.tuple.primitive.PrimitiveTuples;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
@State(Scope.Thread)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@Fork(2)
@Warmup(iterations = 20, time = 2)
@Measurement(iterations = 10, time = 2)
public class RandomAccessVsArraySpliteratorBenchmark
{
private final Interval interval = Interval.oneTo(1_000_000);
private final List<Integer> listN = List.copyOf(interval);
private final List<Integer> arrayList = new ArrayList<>(interval);
private final List<Integer> arraysAsList = Arrays.asList(interval.toArray());
@Benchmark
public IntIntPair minMaxListN()
{
int min = this.listN.stream()
.reduce(Math::min)
.orElse(0);
int max = this.listN.stream()
.reduce(Math::max)
.orElse(0);
return PrimitiveTuples.pair(min, max);
}
@Benchmark
public IntIntPair minMaxArrayList()
{
int min = this.arrayList.stream()
.reduce(Math::min)
.orElse(0);
int max = this.arrayList.stream()
.reduce(Math::max)
.orElse(0);
return PrimitiveTuples.pair(min, max);
}
@Benchmark
public IntIntPair minMaxArraysAsList()
{
int min = this.arraysAsList.stream()
.reduce(Math::min)
.orElse(0);
int max = this.arraysAsList.stream()
.reduce(Math::max)
.orElse(0);
return PrimitiveTuples.pair(min, max);
}
@Benchmark
public IntIntPair parallelMinMaxListN()
{
int min = this.listN.parallelStream()
.reduce(Math::min)
.orElse(0);
int max = this.listN.parallelStream()
.reduce(Math::max)
.orElse(0);
return PrimitiveTuples.pair(min, max);
}
@Benchmark
public IntIntPair parallelMinMaxArrayList()
{
int min = this.arrayList.parallelStream()
.reduce(Math::min)
.orElse(0);
int max = this.arrayList.parallelStream()
.reduce(Math::max)
.orElse(0);
return PrimitiveTuples.pair(min, max);
}
@Benchmark
public IntIntPair parallelMinMaxArraysAsList()
{
int min = this.arraysAsList.parallelStream()
.reduce(Math::min)
.orElse(0);
int max = this.arraysAsList.parallelStream()
.reduce(Math::max)
.orElse(0);
return PrimitiveTuples.pair(min, max);
}
}This is the limit I am willing to spend any more time on this. I think it shows that ListN could get a decent speedup for this test case by switching to ArraySpliterator. Other test cases may see different results. I’m fairly confident that switching ListN to use ArraySpliterator will not likely result in any degradation of performance, but I have also learned that measuring performance is really hard, especially when it comes to JIT compilers.
Final Thoughts
I learned some new things trying these experiments and diving into these all too familiar looking rabbit holes. I don’t know if this will cause any changes in Java, but I do hope it helps shine a light on a potentially useful performance improvement moving ListN from using RandomAccessSpliterator to using ArraySpliterator.
I also hope this provides some useful information that my readers may not have been aware of.
Thanks for reading!
I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions. I am the author of the book, Eclipse Collections Categorically: Level up your programming game.
