Bonus Slides from QCon NY 2023
The slides that didn’t make the 50-minute time limit for our talk.
No time? No problem.
While working on a performance talk for QCon New York, my co-speaker Rustam Mehmandarov and I had more material than we had time for during our presentation. Our solution was simple. Don’t delete the slides. Move them to the Appendix.
The slides are available as AsciiDoc in this GitHub repo. The talk was about memory-efficiency, and the Appendix contains some more examples folks might find interesting.
I also wrote a prequel blog for the talk, titled “Sweating the small stuff in Java,” which goes into much more detail about the historical context for the talk.
Writing the prequel blog saved about 15 minutes from the talk.
Does anyone ever look at the Appendix?
I know I do occasionally. Here’s the Appendix for our talk. You will find some links to resources on the first page, but there is more. The following sections of the blog show the slides as they would appear in IntelliJ, which is what we used, along with AsciiDoc, in the live presentation.
Appendix 0 — Resources
The first page has some useful links to resources we used or referenced in the talk.
GitHub Repos
- Eclipse Collections (creator: Donald Raab)
- DataFrame-EC (creator: Vladimir Zakharov)
- Jackson Dataformat CSV (creator: @cowtowncoder)
- Jackson Datatypes Collections
Kata Repos
Articles
We referenced the Java Object Layout (JOL) tool earlier in our talk; it is the tool we used for measuring memory footprints. Here’s a link to the slide with the references to JOL, which will help explain how we came up with some of the example slides that follow. The following image shows the slide as it appeared in our talk.
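For readers who want to reproduce the footprint measurements, here is a minimal sketch of how JOL can be used; the class and variable names are my own, and the only real API assumed here is GraphLayout from the org.openjdk.jol:jol-core dependency. Depending on your JDK and VM options, JOL may print a warning about how it obtains object sizes.

import org.openjdk.jol.info.GraphLayout;

import java.util.ArrayList;
import java.util.List;

public class FootprintExample
{
    public static void main(String[] args)
    {
        // Build a small object graph to measure.
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 10; i++)
        {
            list.add(i);
        }

        // GraphLayout walks the reachable object graph and reports its footprint.
        GraphLayout layout = GraphLayout.parseInstance(list);
        System.out.println(layout.toFootprint());   // per-class breakdown
        System.out.println(layout.totalSize());     // total bytes retained
    }
}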
Appendix 1 — Boxed vs. Primitive Lists
We didn’t have time to show every memory cost comparison we had prepared, so here’s one where we compared a java.util.ArrayList of Integer with an IntArrayList. Each List contains the integer values 1 through 10.
Note that the extra cost of 160 bytes for ArrayList is due to boxing the int values as Integer instances.
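If you want to reproduce the comparison, here is a hedged sketch using JOL again; the exact byte counts will vary with JDK version, compressed oops, and object alignment, so treat the 160-byte delta as representative rather than guaranteed.

import org.eclipse.collections.impl.list.mutable.primitive.IntArrayList;
import org.openjdk.jol.info.GraphLayout;

import java.util.ArrayList;
import java.util.List;

public class BoxedVsPrimitiveList
{
    public static void main(String[] args)
    {
        // Boxed: ArrayList<Integer> holding the values 1 through 10.
        List<Integer> boxed = new ArrayList<>();
        // Primitive: Eclipse Collections IntArrayList holding the same values.
        IntArrayList primitive = new IntArrayList();
        for (int i = 1; i <= 10; i++)
        {
            boxed.add(i);
            primitive.add(i);
        }

        // Each Integer instance adds object-header and alignment overhead
        // that the int[] inside IntArrayList does not pay.
        System.out.println("ArrayList<Integer>: " + GraphLayout.parseInstance(boxed).totalSize() + " bytes");
        System.out.println("IntArrayList:       " + GraphLayout.parseInstance(primitive).totalSize() + " bytes");
    }
}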
Appendix 2 — Mutable vs. Immutable Lists
The JDK now provides both Mutable and Immutable List implementations. They both implement the List interface. Most folks won’t realize that the Immutable List implementations are more memory efficient than their Mutable counterparts. This is because they are trimmed to size, since they don’t change. There are ImmutableCollections$ListN and ImmutableCollections$List12 implementations. The latter should be read as ListOneTwo, not ListTwelve, which is how I read it when I first saw the class. This class contains either one or two elements.
In this example, we created a List with two Integer instances. First we used an ArrayList, and then we copied the ArrayList into an Immutable List using List.copyOf().
The boxing cost is the same between the Mutable and Immutable List implementations in the JDK, but the List12 instance does not have a default-sized array of size 10 like the ArrayList.
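Here is a minimal sketch of that comparison, again measured with JOL; ImmutableCollections$List12 is an internal JDK class, so only its behavior through List.copyOf() is guaranteed.

import org.openjdk.jol.info.GraphLayout;

import java.util.ArrayList;
import java.util.List;

public class MutableVsImmutableList
{
    public static void main(String[] args)
    {
        // Mutable: ArrayList grows a default-sized backing array of length 10 once populated.
        List<Integer> mutable = new ArrayList<>();
        mutable.add(1);
        mutable.add(2);

        // Immutable: List.copyOf() returns a trimmed-to-size implementation.
        List<Integer> immutable = List.copyOf(mutable);

        System.out.println(immutable.getClass());  // java.util.ImmutableCollections$List12
        System.out.println("ArrayList:     " + GraphLayout.parseInstance(mutable).totalSize() + " bytes");
        System.out.println("List.copyOf(): " + GraphLayout.parseInstance(immutable).totalSize() + " bytes");
    }
}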
Appendix 3 — Boxed vs. Primitive Map of Long → Set of Long
I was asked on Twitter if there was a more efficient way of creating a Map of Long to Set of Long for 200,000 Long keys using Eclipse Collections. The short answer is yes, as long as you don’t box the Long values.
Each Long object costs 24 bytes. Those bytes can add up quickly depending on your use cases. Don’t box!
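Here is a hedged sketch of the two approaches. The boxed version keys a HashMap with Long and stores HashSet<Long> values; the Eclipse Collections version keeps the keys and the set elements as primitive longs (the map values are still objects, but they are primitive LongHashSets rather than sets of boxed Longs).

import org.eclipse.collections.api.map.primitive.MutableLongObjectMap;
import org.eclipse.collections.api.set.primitive.MutableLongSet;
import org.eclipse.collections.impl.factory.primitive.LongObjectMaps;
import org.eclipse.collections.impl.set.mutable.primitive.LongHashSet;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LongToLongSetMaps
{
    public static void main(String[] args)
    {
        // Boxed: each key and each set element is a Long instance (24 bytes outside the small-value cache).
        Map<Long, Set<Long>> boxed = new HashMap<>();
        // Primitive: keys are longs and set elements are longs; no Long boxes.
        MutableLongObjectMap<MutableLongSet> primitive = LongObjectMaps.mutable.empty();

        for (long key = 0; key < 200_000; key++)
        {
            boxed.computeIfAbsent(key, k -> new HashSet<>()).add(key * 2);
            primitive.getIfAbsentPut(key, LongHashSet::new).add(key * 2);
        }
    }
}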
Appendix 4 — Caching vs. Pooling
We discussed pooling in our talk, and described some of the pools built into the JDK, like String.intern() and the boxed Number pools available through the valueOf methods on the integral value types Byte, Short, Integer, and Long. Caching is subtly different in that lookups for an object are usually provided via some index. Pooling provides uniquing, and lookup is based on the instance you are looking for.
Country is implemented as a record, and we keep a cache of Country instances indexed by the country name in a Map.
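A minimal sketch of the distinction, assuming a hypothetical Country record with just a name field (the real record in the talk may have more fields): the cache looks instances up by a key, while the JDK pools shown below unify by the instance’s own value.

import java.util.HashMap;
import java.util.Map;

public class CachingVsPooling
{
    // Hypothetical record; the single field is an assumption for illustration.
    record Country(String name) {}

    // Caching: lookup goes through an index (the country name).
    private static final Map<String, Country> COUNTRY_CACHE = new HashMap<>();

    static Country countryFor(String name)
    {
        return COUNTRY_CACHE.computeIfAbsent(name, Country::new);
    }

    public static void main(String[] args)
    {
        // The same key always returns the same cached instance.
        System.out.println(countryFor("Norway") == countryFor("Norway"));  // true

        // Pooling: lookup is by the value of the instance itself.
        String pooled = new String("QCon").intern();
        System.out.println(pooled == "QCon");                              // true
        System.out.println(Integer.valueOf(127) == Integer.valueOf(127));  // true (small-value pool)
    }
}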
Appendix 5 — Scaling Conferences x50
In the talk, we covered an example that scaled from 1 million Conference instances to 25 million. A few days before the talk, we tried it again with 50 million and 100 million instances, with the memory tuning done for one of the four row-based solutions (Eclipse Collections ImmutableList). The attempt to load 100 million instances failed with an OutOfMemoryError. I did not have time to research the cause of the OutOfMemoryError and see whether it was fixable.
Here is the slide with 50 million instances of Conference.
The intent here is to show how scaling impacts total memory savings. By manually tuning one of the row-based solutions to save 16 bytes per Conference, we were able to save over 800MB of memory: 16 bytes times 50 million instances is 800 million bytes. If you target the multipliers in your data, even small memory savings can become significant.
Thank you and Enjoy!
Rustam and I had a blast presenting at QCon New York this year, and wanted to thank the conference organizers, our track host Neha Sardana, and everyone who attended our talk! I hope you enjoy the bonus slides I shared here that didn’t make the cut for the talk.
Thank you for reading, and Happy Father’s Day!
I am the creator of and committer for the Eclipse Collections OSS project, which is managed at the Eclipse Foundation. Eclipse Collections is open for contributions.