Learning a little Python using the Eclipse Collections Pet Kata

Eclipse Collections Pet Kata on GitHub (left) : Python implementation of Person class from Pet Kata

A time to learn Python

I’ve programmed in more than twenty programming languages over the past four decades. I’ve coded in Java for 20+ years, and am the creator of a Java collections framework called Eclipse Collections. I have never tried to learn Python before. That is, until now.

Why learn Python?

I am a Java Champion, and regularly teach and advocate for the Java programming language. My learning Python might seem a bit odd to some. In fact, some of my friends started pinging me on Twitter after I posted several tweets about solving the Pet Kata in Python, wondering if I was having a mid-life crisis or if I had been abducted. I tried to alleviate their concerns with a tweet.

Python is an older programming language than Java (Python first appeared in 1991, Java in 1995), which might surprise some folks. I often recommend that developers look at older languages to learn new techniques and approaches. I try to practice what I preach.

I recently shared my experience returning to my nostalgic Smalltalk days and implementing the Deck of Cards Kata using the Pharo Smalltalk IDE. Programming in Smalltalk using unit tests was a tremendous amount of fun. You can see the results in the following blog.

I thought that I might as well make the most of my recent visit to a dynamic language I knew well and see how quickly I would be able to code productively in Python. Just as I did with Smalltalk, I chose a code kata to implement. This time I went with the Pet Kata. I’ve often told folks they can use the Eclipse Collections Katas to learn any programming language, as long as there is a way to specify unit tests and implement a domain in the language.

I made the job of learning Python a lot easier by using the quite capable PyCharm IDE from JetBrains. PyCharm felt very familiar to me, because I have programmed in the IntelliJ IDE from JetBrains for almost twenty years.

The Domain

To start things off, I implemented the domain classes. I knew Python supported object-oriented programming, so I created three classes to represent Person, Pet, and PetType. A Person has zero or more Pets and each Pet has a PetType, which I implemented via an Enum, which is a type both Java and Python support.

This is how I implemented PetType.

PetType Enum

This is how I implemented the Pet class.

Pet Class

This is how I implemented the Person class.

Person class

You can compare the solutions for the domain classes in Python, to the solutions in Java using Eclipse Collections here.

What did I learn in the domain classes?

Space matters in Python. So do spaces. There are no curly braces like in Java, and Python is less forgiving than Smalltalk in terms of how code is formatted and edited.

You don’t define instance variables as you would in Java or Smalltalk, with a special declaration area. The variables come into existence as you assign them in the __init__ method. Since Python is dynamically typed, there is no type to specify.

The name self plays the same role as self in Smalltalk and this in Java: it refers to the instance. In Python it is a naming convention for the first method parameter rather than a reserved keyword.

The [] is a literal form of an empty list. You can add things to a list using the append method.

You define a function using def, followed by the function name and a parameter list in parentheses, and you end the definition line with a :. When you define an instance method, the first parameter receives the instance and is named self by convention.
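As a rough illustration of the points above (indentation, __init__, self, the [] literal, append, and def), here is a minimal sketch. It is not the Person class from the kata; the attribute and method names are assumptions chosen for the example.

```python
class Person:
    def __init__(self, first_name, last_name):
        # Instance variables come into existence when they are assigned;
        # there is no separate declaration area and no type to specify.
        self.first_name = first_name
        self.last_name = last_name
        self.pets = []  # literal form of an empty list

    def add_pet(self, pet_name):
        # The first parameter of an instance method receives the instance.
        self.pets.append(pet_name)
        return self  # returning self is a choice made here to allow chaining

    def full_name(self):
        return self.first_name + " " + self.last_name


# Hypothetical usage: the names are illustrative only.
person = Person("Mary", "Smith").add_pet("Tabby")
```

Indentation is what delimits the class body and each method body, which is why there are no curly braces anywhere in the sketch.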

To implement an Enum, you need to import the Enum type from the enum module or package (not too sure on the terminology here yet). You then have your type extend the Enum type, by putting the Enum type inside the parentheses after the class name, which is PetType in this case.
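For reference, enum is a module in the Python standard library. A minimal sketch of an enum along these lines might look as follows; the member names and the integer values are assumptions for illustration, not necessarily what the kata solution uses.

```python
from enum import Enum


class PetType(Enum):
    # The base class goes in parentheses after the class name.
    # Member values are arbitrary here; only uniqueness matters.
    CAT = 1
    DOG = 2
    HAMSTER = 3
    TURTLE = 4
    BIRD = 5
    SNAKE = 6


cat = PetType.CAT                  # access a member by attribute
by_value = PetType(2)              # look a member up by its value
by_name = PetType["SNAKE"]         # look a member up by its name
all_names = [pet_type.name for pet_type in PetType]  # enums are iterable
```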

The lambda keyword is used to create a lambda, or anonymous function. The parameters are specified after the keyword, and separated from the expression with a :. I used lambdas with both map and filter functions.
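To make the syntax concrete, here is a small sketch comparing a lambda with the equivalent def, and passing lambdas to map and filter. The list of ages is made-up sample data.

```python
# A named function and a lambda that compute the same thing.
def double(age):
    return age * 2


double_lambda = lambda age: age * 2  # parameters before the colon, expression after

ages = [2, 3, 2, 4, 1]  # hypothetical sample data

# map applies the lambda to every element; filter keeps the elements
# for which the lambda returns a truthy value.
doubled = list(map(lambda age: age * 2, ages))
older_than_two = list(filter(lambda age: age > 2, ages))
```

A lambda body is limited to a single expression, so anything needing statements still calls for def.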

Functions I used in the domain were any, len and map. This is one place where I think Python is missing some object-oriented goodness of encapsulated methods. I think any, len and map should be methods on the list type, not functions that accept an iterable type. Making them methods instead of functions would make them easier to discover, because the IDE can show a list of available methods through code assist when you type . after a value. Since they are functions, code assist did not surface them, and I had to turn to Google with questions like “How do I find the size of a list in python?”

The Tests in the Kata

There are five separate exercises in the Pet Kata. I decided I would only implement the first four exercises, because that would be enough to get started and develop a feel for what it is like to program in Python.

I had to learn how to develop a unit test in Python, so I started Googling for the info. What I discovered is that unit testing in Python is somewhat similar to the unit testing I was familiar with in JUnit 3.x versions. Each test class has to extend unittest.TestCase, and each test method must be prefixed with the word test.

The assertions are available directly on the test instance through self. So to assert equality, you would use the method self.assertEqual().
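A minimal sketch of that structure is shown below. The class name and the test bodies are invented for illustration and are not taken from the kata.

```python
import unittest


class ListBasicsTest(unittest.TestCase):
    # The test runner discovers methods whose names start with "test".
    def test_append_grows_list(self):
        names = []
        names.append("Tabby")
        self.assertEqual(1, len(names))
        self.assertEqual(["Tabby"], names)

    def test_any_on_empty_list_is_false(self):
        self.assertFalse(any([]))
```

Saved in a file, tests like these can be run from the command line with python -m unittest, or directly from the IDE.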

PetDomainForKata

Before I started writing the tests, I created the test data I used in the tests. In Java, I put this code in an abstract class, but in Python I had trouble figuring out how to define an instance variable in a parent class and then access it from a subclass. So instead I decided to just create a function called get_people() which I used in each test case to get the list of people with their pets.
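The shape of such a helper could be sketched as follows. Only the function name get_people() comes from the text above; the namedtuple stand-ins for the domain classes and the sample people and pets are assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical lightweight stand-ins for the real domain classes.
Pet = namedtuple("Pet", ["pet_type", "name", "age"])
Person = namedtuple("Person", ["first_name", "last_name", "pets"])


def get_people():
    # Build a fresh list on every call so one test cannot
    # accidentally change the data seen by another test.
    return [
        Person("Mary", "Smith", [Pet("CAT", "Tabby", 2)]),
        Person("Bob", "Smith", [Pet("CAT", "Dolly", 3), Pet("DOG", "Spot", 2)]),
        Person("John", "Doe", []),
    ]
```

For what it is worth, unittest.TestCase also provides a setUp method that runs before each test, which is the usual place to assign shared fixtures to self in a base test class.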

Test data is a list of people with their pets and ages

You can compare the solutions for the test data setup in Python, to the solutions in Java using Eclipse Collections here.

Exercise 1 Test

The tests for exercise 1 in the Pet Kata are fairly straightforward. You need to learn how to transform and filter a list in this exercise.

Transforming and filtering lists using map and filter

In this test class, I learned how to use the functions map, filter, list, and len. Both map and filter return iterators which then need to be converted to lists using the list function.
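The need for the list function is easier to see in a small sketch. The names below are sample data, and the variable names are chosen for the example.

```python
names = ["Mary", "Bob", "Ted"]  # hypothetical sample data

upper = map(str.upper, names)   # a lazy map object, not a list
first_pass = list(upper)        # consuming the iterator produces the values
second_pass = list(upper)       # the iterator is now exhausted

short_names = list(filter(lambda name: len(name) == 3, names))

# A map object has no length until it is converted to a list.
try:
    len(map(str.upper, names))
    map_has_len = True
except TypeError:
    map_has_len = False
```

Because the iterator can only be consumed once, converting to a list up front is the safer choice when the result is asserted on more than once in a test.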

You can compare the solutions for exercise 1 in Python, to the solutions in Java using Eclipse Collections here.

Exercise 2 Test

Most of the tests for exercise 2 in the Pet Kata are also fairly straightforward.

Using any, all, sum, list and filter to complete tests in exercise 2

In this test class, I learned how to use the any, all, and sum functions in addition to what I learned in exercise 1. I was surprised there was no count function that accepts a predicate, but I was able to find how to use sum with map and filter to accomplish what was needed.
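One way to count with a predicate using sum is sketched below. The data is invented, and these are illustrative variations rather than the exact expressions from the kata solution.

```python
pet_counts = [1, 2, 1, 0, 2]  # hypothetical number of pets per person

total_pets = sum(pet_counts)

# Counting with a predicate: True counts as 1 and False as 0 when summed.
people_with_pets = sum(map(lambda count: count > 0, pet_counts))

# The same count, taken as the length of the filtered list.
people_with_pets_via_filter = len(list(filter(lambda count: count > 0, pet_counts)))

anyone_without_pets = any(count == 0 for count in pet_counts)
everyone_has_pets = all(count > 0 for count in pet_counts)
```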

The one test that was not straightforward was the one where I needed to use flatMap. Python does not have a function called flatMap, and instead you need to use a function named chain in a library called itertools along with map.

Learning how to flatten a person's pet types using chain and map
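A sketch of the flatMap idea with chain and map follows. The people are represented here as (name, pet types) tuples, which is an assumption made to keep the example short.

```python
from itertools import chain

# Hypothetical data: each person is a (name, list of pet types) tuple.
people = [
    ("Mary", ["CAT"]),
    ("Bob", ["CAT", "DOG"]),
    ("John", []),
]

# map produces one list of pet types per person;
# chain.from_iterable flattens those lists into a single sequence.
all_pet_types = list(chain.from_iterable(map(lambda person: person[1], people)))
unique_pet_types = set(all_pet_types)
```

chain.from_iterable takes a single iterable of iterables, which avoids unpacking the map result with * as chain itself would require.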

You can compare the solutions for exercise 2 in Python, to the solutions in Java using Eclipse Collections here.

Exercise 3 Test

The tests in exercise 3 are much harder to complete. This is because they do more complex things like counting occurrences and grouping elements by a key.

Implementing countByEach, groupBy and groupByEach in Python

I had to use both the collections.Counter class and itertools.chain function in order to implement the method known in Eclipse Collections as countByEach. The method countByEach is a flatCollect followed by a countBy. In Eclipse Collections, countBy will return a Bag type. I learned that the equivalent type in Python is called Counter.
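The combination can be sketched like this, with made-up pet types per person standing in for the kata data.

```python
from collections import Counter
from itertools import chain

# Hypothetical data: one list of pet types per person.
pet_types_per_person = [["CAT"], ["CAT", "DOG"], ["DOG"], ["SNAKE"], []]

# Flatten first (the flatCollect step), then count (the countBy step).
pet_type_counts = Counter(chain.from_iterable(pet_types_per_person))
```

Like a Bag, a Counter reports zero for an item it has never seen instead of raising an error, so pet_type_counts["BIRD"] is 0 here.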

There is a groupby function in itertools, but it didn’t quite behave as I had hoped. I could not find a simple way to transform the result from groupby to a dictionary, so I followed the recommended imperative way of using a for loop and converting each group value to a list using the list function. It was interesting to see how to iterate over key and group together in the for loop.
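Part of what makes itertools.groupby surprising is that it only groups consecutive elements with the same key, so the input generally has to be sorted by that key first. A minimal sketch with invented (first name, last name) tuples:

```python
from itertools import groupby

# Hypothetical data: (first name, last name) tuples.
people = [("Mary", "Smith"), ("Bob", "Smith"), ("John", "Doe"), ("Ted", "Smith")]


def last_name(person):
    return person[1]


# Without sorting, the Smiths end up in two separate groups.
unsorted_keys = [key for key, group in groupby(people, key=last_name)]

# Sort by the grouping key, then collect each group into a list.
people_by_last_name = {}
for key, group in groupby(sorted(people, key=last_name), key=last_name):
    people_by_last_name[key] = list(group)
```

Each group is itself a one-shot iterator, which is why it gets converted with list inside the loop.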

Finally, I had to switch to a completely imperative approach with nested loops to convert the list of people into a dictionary of pet types mapped to sets of people.
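The nested-loop approach might look roughly like the following sketch. The tuple representation of people and the use of dict.setdefault are choices made for this example, not necessarily what the kata solution does.

```python
# Hypothetical data: each person is a (name, list of pet types) tuple.
people = [
    ("Mary", ["CAT"]),
    ("Bob", ["CAT", "DOG"]),
    ("Ted", ["DOG"]),
    ("John", []),
]

people_by_pet_type = {}
for name, pet_types in people:
    for pet_type in pet_types:
        # setdefault returns the existing set for the key,
        # or stores and returns a new empty set the first time.
        people_by_pet_type.setdefault(pet_type, set()).add(name)
```

A person with several pet types appears in several sets, and a person with no pets appears in none.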

You can compare the solutions for exercise 3 in Python, to the solutions in Java using Eclipse Collections here.

Exercise 4 Test

After exercises 2 and 3, I was worried exercise 4 was going to be even more difficult. It turned out to not be so hard.

Calculating the age statistics for Pets with min, max, sum and median

I couldn’t find a way to calculate min, max, sum and mean in Python using a single iteration like we do in Java using IntSummaryStatistics, but there are functions for min, max, sum and mean (you need to import the statistics module for mean). I was able to use any and all again, but I could not find an equivalent of none, so used assertFalse with any.
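A small sketch of those functions, using a made-up list of pet ages. Each call iterates over the list separately, so this is several passes rather than one.

```python
from statistics import mean

ages = [2, 3, 2, 4, 1]  # hypothetical pet ages

youngest = min(ages)
oldest = max(ages)
total_age = sum(ages)
average_age = mean(ages)

# There is no built-in none function; negating any gives the same result.
no_negative_ages = not any(age < 0 for age in ages)
all_ages_positive = all(age > 0 for age in ages)
```

In a test, the none-style check can be written as self.assertFalse(any(...)), as described above.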

Joining strings, most common, median

For the remaining tests I learned how to use the join method on a string along with map to get a string representation of the names of Bob Smith’s pets. I then used Counter along with a method named most_common to find the top three pets. Finally, I used the statistics.median function to calculate the median age of the pets.
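These three pieces can be sketched as follows. The pets, pet types, ages, and the " & " separator are all assumptions for the example.

```python
from collections import Counter
from statistics import median

# Hypothetical data: (pet name, pet type) tuples for one person.
bobs_pets = [("Dolly", "CAT"), ("Spot", "DOG")]

# join is called on the separator string and accepts any iterable of strings.
pet_names = " & ".join(map(lambda pet: pet[0], bobs_pets))

# Hypothetical pet types across everyone, with distinct counts to avoid ties.
pet_types = ["CAT", "DOG", "CAT", "SNAKE", "DOG",
             "CAT", "SNAKE", "SNAKE", "SNAKE", "BIRD"]
top_three = Counter(pet_types).most_common(3)

median_age = median([2, 3, 2, 4, 1])  # hypothetical pet ages
```

most_common returns a list of (item, count) tuples ordered from the highest count down.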

You can compare the solutions for exercise 4 in Python, to the solutions in Java using Eclipse Collections here.

Lessons Learned

I learned a little bit about Python in this exercise. It was interesting to see how quickly I could get a bunch of tests written in Python up and running. Where I slowed down a bit was in learning whether I needed a function or a method to accomplish something. It was also challenging to understand what the type of something was so I could explore its API in more detail. I was able to leverage the debugger in PyCharm a bit, and it helped in some cases.

On Twitter, I was comparing and contrasting the Python solutions I came up with against the Java solutions using Eclipse Collections. What I discovered along the way was that there were APIs in Eclipse Collections that weren’t used in the solutions and that would have made things more concise. I submitted two pull requests to the Eclipse Collections Kata as a result. The additional methods I used to refactor some of the solutions were countByEach, flatCollectInt, and containsBy. These are more advanced “fused” APIs that have evolved over time. You can read more about the evolution of the containsBy “fused” API in the following blog.

I hope you enjoyed this learning experiment implementing the Pet Kata in Python. I definitely feel like I have developed some basic understanding of the Python programming language as a result. And now that I have written this blog, I always have something I can refer back to.

I am a Project Lead and Committer for the Eclipse Collections OSS project at the Eclipse Foundation. Eclipse Collections is open for contributions. If you like the library, you can let us know by starring it on GitHub.

Java Champion. Creator of the Eclipse Collections OSS Java library (http://www.eclipse.org/collections/). Inspired by Smalltalk. Opinions are my own.