Traveling the road from Idea all the way to OpenJDK
See something. Say something. Prove something. Do something.
Starting with an Idea
In May 2013, I was in traveling on business in London. The GS Collections library had been released to open source on GitHub a year and a half earlier (January 2012). I was a member of the JSR 335 Expert Group, and would post in the lambda-lib-spec-experts mail list when I wanted to share an opinion on a topic being discussed.
While I was in London, I was actively testing GS Collections with the Java Stream library using a binary snapshot of the pre-Java 8 release. I recall spending several hours in my hotel room, across from St. Paul’s Cathedral, puzzling over a peculiar performance difference I was seeing between
parallelStream when used with GS Collections’
The Java Microbenchmark Harness project (JMH) did not yet exist in open source at this point in time. The first publicly available version of JMH that I could see in GitHub was tagged as a release in November, 2013. I had written and used my own microbenchmark harness in Java for several years. The tool was fairly simple, but I often questioned the accuracy of results. The differences I was seeing between
FastList were so stark and surprising, I believed I had discovered something that needed to be addressed.
I knew that access to
Stream would be introduced in Java as default methods named
parallelStream() on the
java.util.Collection interface. Any interface or class that extended or implemented
java.util.Collection would inherit the default implementation of the
parallelStream() methods. Both
parallelStream depended on another default method named
spliterator. I knew that
FastList and the other mutable containers in GS Collections would immediately be able to leverage Java Streams because they inherited from
I ran some benchmarks comparing
parallelStream with eager iteration methods provided by GS Collections for both
ArrayList. The benchmarks were measuring the average time in milliseconds for each operation, so smaller is better in the charts.
Here are the performance numbers I saw for GS Collections’
Here are the performance numbers I saw for
Eager filtering in GS Collections (green bars) performed better than
parallelStream (blue bars) filtering in all cases. This met my expectations, as eager methods will almost always outperform for a single operation like filter. The difference in performance between
parallelStream filtering with
parallelStream filtering with
ArrayList is the thing that confused and concerned me. I expected
FastList performance to be very similar to
The benchmarks were run using pre-release Java 8 binaries on a dual core Windows machine. Seeing close to a 2x performance increase for the parallel cases is what I would have hoped for with filtering. The
parallelStream case for
FastList, ran slower than the
stream (serial) case. This made no sense to me, so I went and looked at the code for
parallelStream to try and figure out why I was seeing this unexpected difference.
The problem seemed to center around the default implementation of
Collection that was returning an
ArrayList returns a specialized
spliterator implementation called
ArrayListSpliterator. There was no
spliterator provided for non-JDK
List implementations, like
FastList. I would propose in the following thread that a
RandomAccessSpliterator should be added to the JDK, so non-JDK
List implementations would not suffer significant performance penalties when using
parallelStream. I reported my findings in the
lambda-lib-spec-experts group mailing list. Prior to Java 8, this was the group to communicate anything around Java Streams. The following link shows the thread I started in the mail list the week that I was in London running my tests.
Performance of default Spliterators
Apologies if this was already discussed, thought about, planned or in progress. Right now in the build I am using…
The response to my proposal of a
RandomAccessSpliterator was quite receptive. However, just because I shared an idea, didn’t mean I didn’t have A LOT more work to do. I was planning to attend the JVM Language Summit (JVMLS) in July of 2013 and discuss my findings in person with Brian Goetz and other members of the JSR 335 Expert Group. Unfortunately, a month after I initially reported my findings, my life was thrown into absolute turmoil, and I wound up abandoning the idea for several years.
An unexpected detour and a very thoughtful gift
A week before I was going to go to JVMLS 2013, my wife was admitted to the hospital and diagnosed with AML (Leukemia). I wrote about this story for the first time a couple of years ago.
What I learned about COVID-19 from Acute Myeloid Leukemia
Survival tactics of a germaphobe and spouse of an AML Warrior
With my wife in the hospital receiving invasive chemo for months at a time, and ultimately a stem cell transplant, I was set into autopilot mode just trying to help family survive. Any findings I had shared a couple of months prior were forgotten (but not lost) in the tumultuous times I was now going through.
My friend and colleague Vlad attended the JVMLS in 2013 without me, and returned with a special gift in the form of a JVMLS 2013 coffee thermos.
I keep this incredibly thoughtful gift on my office shelf in front of me where I can see it every day.
Another detour — Eclipse Collections
After Java 8 was released in March 2014, I decided after many discussions that the best path forward for GS Collections was to move to the Eclipse Foundation, where it would become Eclipse Collections. The actual migration work for Eclipse Collections was completed by Hiroshi who would become the first Eclipse Collections Project Lead. Hiroshi Ito is also one of the heroes in this story of
Once Eclipse Collections was migrated to the Eclipse Foundation at the end of 2015, I finally revisited my idea from two years prior. I set out to reconfirm my concerns of the parallel performance of Java Streams with
FastList. The initial release of Eclipse Collections 7.0 was feature identical to GS Collections 7.0. Eclipse Collections 7.0 and GS Collections 7.0 were both compiled with Java 7. This would mean neither library would have overrides for
spliterator() and would inherit any default implementations from
List. The default implementation of
spliterator() was still
IteratorSpliterator in Java 8.
IteratorSpliterator was woefully suboptimal for
List implementations that were
I wrote some benchmarks using JMH (it was now available), and worked with Hiroshi Ito who wrote a first implementation of
RandomAccessSpliterator that I used to test with. IIRC, I ran the benchmarks on a quad core Windows machine (apologies, I can’t recall the exact machine/software specs and didn’t write them down). The following were the results of filtering the evens from a list from 1 to 1,000,000 with the default implementation of
IteratorSpliterator used for
FastList. The results were in operations per second, so bigger is better.
When we plugged in the custom
RandomAccessSpliterator implementation that Hiroshi was working on, this was the change we saw in performance.
This is exactly what we had hoped to see. Once we were confident that the performance differences we were seeing between
RandomAccessSpliterator were both compelling and reproducible, Hiroshi began communicating with the
core-libs-dev mail group.
Onward to OpenJDK
This was the initial email with patch sent to the
core-libs-dev mail group by Hiroshi Ito. The email was dated May 11, 2016 which was three years and two days after I originally shared the idea with the
lambda-lib-spec-experts mail group.
RFR: Implement RandomAccess spliterator
Hi, Please kindly review the attached patch for RandomAccessSpliterator implementation. Currently default…
The implementation work on
RandomAccessSpliterator was a joint effort between Hiroshi Ito and Paul Sandoz. I was just the idea guy and performance testing person. Hiroshi and Paul did all of the real heavy lifting and unit testing for the implementation of
RandomAccessSpliterator. I am extremely thankful for the efforts of both of them.
The following bug exists as a reference in the OpenJDK bug system for this feature.
List.spliterator should optimize for RandomAccess lists
The default implementation of List.spliterator() produces a spliterator from the list’s iterator. If the List…
Now that OpenJDK is on GitHub, the source for
RandomAccessSpliterator can be viewed in the
AbstractList source where it resides as a static class.
jdk/AbstractList.java at master · openjdk/jdk
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below…
Java 9 and RandomAccessSpliterator
RandomAccessSpliterator code was delivered in time to be released with Java 9 (September 2017).
RandomAccessSpliterator would become the default
FastList and any other non-JDK
List in any other library, that did not provide their own
The Long Wait
RandomAccessSpliterator made it in time for Java 9, many companies will have waited for Java 11 before they would have seen any benefits of the optimization. It’s only been fairly recently that a large number of applications have made the migration effort from Java 8 to Java 11 or Java 17.
By the time Java 9 was released, Eclipse Collections had upgraded
FastList (as of EC 8.1) and
ImmutableArrayList (as of EC 9.0) to have custom
spliterator overrides. However there are still classes today in Eclipse Collections that will wind up using the default implementation of
spliterator that returns
RandomAccessSpliterator, including a class named
I decided to run the same benchmarks I ran over 6 years ago only this time using Java 17 with
java.util.ArrayList and Eclipse Collections’
ArrayListAdapter to see if a
parallelStream has reasonably similar performance characteristics.
The results were pretty much as I expected. The
ArrayListAdapter is wrapping the
ArrayList instance, so I did expect it to have some extra cost due to having to delegate all calls to the underlying
ArrayList. That said,
RandomAccessSpliterator appears to be doing its job pretty well.
ArrayList has a specialized
spliterator override which returns an
ArrayListSpliterator and it is not that much different from
RandomAccessSpliterator in this benchmark run.
I ran these benchmarks on a 2.7 Ghz quad core Intel Core I7 MacBook Pro (Early 2013) running with the following JDK and version of JMH.
# JMH version: 1.35
# VM version: JDK 17, OpenJDK 64-Bit Server VM, 17+35–2724
The source code for the benchmarks is here.
Reflections on a road less traveled
Proposing and getting a change into the OpenJDK requires a serious commitment of patience, time and effort. This story demonstrates that it is entirely possible to contribute to the OpenJDK, but may not be on the timeframe you would expect. I’ve also had ideas rejected several times. It’s not personal. The JDK is used by millions of developers for over a quarter century to write and run millions of applications. The developers maintaining this amazing code base are experts and we should be careful not to waste their time.
You should submit your ideas as a proposal to the
core-libs-dev mail list. Search first to see if your idea has already been proposed and/or implemented.
It’s possible you may have a great idea, and it still might get rejected. There is a cost/benefit tradeoff that has to be understood and weighed by the OpenJDK developers. Your amazing idea may be deemed not practical and too expensive.
Many ideas don’t make it, and that is perfectly okay
Here is an example of an idea that I proposed, that has not been implemented yet, and may never be, as it may just not be practical or worth the cost. The developer who responded to my initial proposal, Claes Redestad, was extremely thoughtful and helpful in his responses in the mail list. He even built a proof of concept of the method in a pull request with some benchmarks. I am very thankful for the time he took to look into my idea.
Proposal: Replace foreach loop with Iterable.forEach in String.join(CharSequence, Iterable)
Next message (by thread): Proposal: Replace foreach loop with Iterable.forEach in String.join(CharSequence, Iterable)
There are perspectives that folks who work on the OpenJDK codebase all the time have that regular developers coding in Java may not fully understand or appreciate. Be prepared to have your idea challenged, and to possibly learn that there is something more complex or complicated hiding in that simple code example you provided.
I learned in the
String.join case that the code I was looking at in the JDK 17 release was not actually the code that was currently in the OpenJDK repository being prepared for JDK 18. You can now see the difference for
String.join between JDK 18 and JDK 17.
Start small and stay focused
My recommendation for proposing changes, improvements, bug fixes to the JDK is to start small and stay focused. If you see something, and then work up the courage to say something, be prepared to prove something, and if you get past that point get ready to commit yourself to doing something. Once you are successful with a few small wins, you may be able to move on to bigger changes.
The road might take longer than you expect, but the final destination may be worth it for you. I can tell you after nine years of waiting, and finally experimenting with the performance of
RandomAccessSpliterator in Java 17 and seeing the results I expected, the time I spent and waited was well worth it.
Thank you, Hiroshi, Paul, Brian, Doug, Claes, the entire OpenJDK team and all the members of the JSR 335 Expert Group.