Many simple for loops in Python can be replaced with list comprehensions. You can often hear that a list comprehension is more Pythonic (almost as if there were a scale for comparing how Pythonic something is). In this article, I will compare their performance and discuss when a list comprehension is a good idea, and when it's not.
Filter a list with a for loop
Let's use a simple scenario for a loop operation - we have a list of numbers, and we want to remove the odd ones. One important thing to keep in mind is that we can't safely remove items from a list as we iterate over it. Instead, we have to create a new one containing only the even numbers:
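A minimal sketch of such a function. The list name MILLION_NUMBERS is the one used later in this article, but its contents (the integers from 0 to 999,999) and the function name for_loop are assumptions made for illustration:

```python
# Assumed setup: a module-level list with one million integers.
MILLION_NUMBERS = list(range(1_000_000))


def for_loop():
    """Build a new list that contains only the even numbers."""
    output = []
    for element in MILLION_NUMBERS:
        # A remainder of 0 is falsy, so "not element % 2" is True for even numbers.
        if not element % 2:
            output.append(element)
    return output
```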
if not element % 2 is equivalent to if element % 2 == 0, but it's slightly faster. I will write a separate article about comparing boolean values soon.
Let's measure the execution time of this function. I'm using Python 3.8 for benchmarks (you can read about the whole setup in the Introduction article):
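The benchmarks in this article are run with python -m timeit on the command line. As a rough alternative, the same kind of measurement can be sketched from inside Python with the standard timeit module. The function and list names below are hypothetical, and the numbers you get will depend on your machine:

```python
import timeit

NUMBERS = list(range(1_000_000))  # assumed input: one million integers


def for_loop():
    output = []
    for element in NUMBERS:
        if not element % 2:
            output.append(element)
    return output


# Three rounds of two calls each; the fastest round is the least noisy estimate.
timings = timeit.repeat(for_loop, number=2, repeat=3)
best_per_call = min(timings) / 2
print(f"best of 3: {best_per_call * 1000:.1f} msec per call")
```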
It takes 65 milliseconds to filter a list of one million elements. How fast will a list comprehension deal with the same task?
Filter a list with list comprehension
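The same filtering can be sketched as a list comprehension. As before, the contents of MILLION_NUMBERS and the function name list_comprehension are assumptions for illustration:

```python
# Assumed setup: a module-level list with one million integers.
MILLION_NUMBERS = list(range(1_000_000))


def list_comprehension():
    """Keep only the even numbers, in a single expression."""
    return [number for number in MILLION_NUMBERS if not number % 2]
```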
The for loop is around 50% slower than the list comprehension (65.4/44.5 ≈ 1.47). And we just reduced five lines of code to one line! Cleaner and faster code? Great!
Can we make it better?
Filter a list with the filter function
Python has a built-in filter function for filtering collections of elements. This sounds like a perfect use case for our problem, so let's see how fast it will be.
284 nanoseconds?! That's suspiciously fast! It turns out that the filter function returns an iterator. It doesn't immediately go over one million elements, but it will return the next value when we ask for it. To get all the results at once, we can convert this iterator to a list.
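A small sketch of both behaviours: the lazy iterator that filter returns, and the version that materializes everything into a list. The helper is_even and the function name filter_return_list are hypothetical names chosen for this example:

```python
# Assumed setup: a module-level list with one million integers.
MILLION_NUMBERS = list(range(1_000_000))


def is_even(number):
    return number % 2 == 0


# filter() returns immediately; no element has been checked yet.
lazy_result = filter(is_even, MILLION_NUMBERS)

# Values are computed one at a time, only when we ask for them.
first = next(lazy_result)
second = next(lazy_result)


def filter_return_list():
    """Force the iterator to process all elements and collect them in a list."""
    return list(filter(is_even, MILLION_NUMBERS))
```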
Now, its performance is not so great anymore. It's 133% slower than the list comprehension (104/44.5 ≈ 2.337) and 60% slower than the for loop (104/65.4 ≈ 1.590).
While, in this case, it's not the best solution, an iterator is an excellent alternative to a list comprehension when we don't need to have all the results at once. If it turns out that we only need to get a few elements from the filtered list, an iterator will be a few orders of magnitude faster than other non-lazy solutions.
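To illustrate the lazy case, here is a sketch that takes only the first five matches with itertools.islice. The choice of five elements and the variable names are assumptions for this example. Only the first nine input values are ever examined:

```python
from itertools import islice

# Assumed setup: a module-level list with one million integers.
MILLION_NUMBERS = list(range(1_000_000))

# Nothing is filtered yet; this only creates the iterator.
even_numbers = filter(lambda number: number % 2 == 0, MILLION_NUMBERS)

# islice stops pulling values from the iterator after five results.
first_five = list(islice(even_numbers, 5))
print(first_five)  # [0, 2, 4, 6, 8]
```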
filterfalse()
We could use the filterfalse() function from the itertools library to simplify the filtering condition. filterfalse does the opposite of filter: it picks those elements for which the function evaluates to False. Unfortunately, it doesn't make any difference when it comes to performance:
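```python
from itertools import filterfalse

# MILLION_NUMBERS is the same module-level list used in the earlier examples.


def filterfalse_list():
    # x % 2 is falsy (0) for even numbers, so filterfalse keeps exactly those.
    return list(filterfalse(lambda x: x % 2, MILLION_NUMBERS))
```

```
$ python -m timeit -s "from filter_list import filterfalse_list" "filterfalse_list()"
2 loops, best of 5: 103 msec per loop
```

More than one operation in the loop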
List comprehensions are often faster and easier to read, but they have one significant limitation. What happens if you want to execute more than one simple instruction? A list comprehension can't accept multiple statements (without sacrificing readability). But in many cases, you can wrap those multiple statements in a function.
Let's use a slightly modified version of the famous Fizz Buzz program as an example. We want to iterate over a list of elements and for each of them return:
- fizzbuzz if the number can be divided by 3 and 5
- fizz if the number can be divided by 3
- buzz if the number can be divided by 5
- the number itself, if it can't be divided by 3 or 5
Here is a simple solution:
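A minimal sketch of a for loop-based version. It takes the list as a parameter named numbers, which is an assumption made to keep the example self-contained:

```python
def fizz_buzz(numbers):
    """Apply the fizz buzz rules to every number and return a new list."""
    output = []
    for number in numbers:
        if number % 3 == 0 and number % 5 == 0:
            output.append("fizzbuzz")
        elif number % 3 == 0:
            output.append("fizz")
        elif number % 5 == 0:
            output.append("buzz")
        else:
            output.append(number)
    return output
```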
Here is the list comprehension equivalent of the fizz_buzz():
It's not easy to read - at least for me. It gets better if we split it into multiple lines:
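One way to sketch it is with chained conditional expressions. The function name fizz_buzz_one_line and the numbers parameter are hypothetical:

```python
def fizz_buzz_one_line(numbers):
    # The conditional expressions are evaluated left to right; the first true condition wins.
    return ["fizzbuzz" if x % 3 == 0 and x % 5 == 0 else "fizz" if x % 3 == 0 else "buzz" if x % 5 == 0 else x for x in numbers]
```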
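The same chained expression, sketched with one condition per line. The function name fizz_buzz_multiline and the numbers parameter are hypothetical:

```python
def fizz_buzz_multiline(numbers):
    return [
        "fizzbuzz" if x % 3 == 0 and x % 5 == 0
        else "fizz" if x % 3 == 0
        else "buzz" if x % 5 == 0
        else x
        for x in numbers
    ]
```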
But if I see a list comprehension that spans multiple lines, I try to refactor it. We can extract the if statements into a separate function:
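A sketch of such a helper, assuming it handles a single number at a time. The name transform matches the one used in the benchmark discussion in this article:

```python
def transform(number):
    """Return the fizz buzz value for a single number."""
    if number % 3 == 0 and number % 5 == 0:
        return "fizzbuzz"
    elif number % 3 == 0:
        return "fizz"
    elif number % 5 == 0:
        return "buzz"
    return number
```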
Now it's trivial to turn it into a list comprehension. And we get the additional benefit of a nice separation of logic into a function that does the fizz buzz check and a function that actually iterates over a list of numbers and applies the fizz buzz transformation.
Here is the improved list comprehension:
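A sketch of how the two pieces fit together. The transform() helper is repeated so the example runs on its own, and the name fizz_buzz_with_transform is hypothetical:

```python
def transform(number):
    """Return the fizz buzz value for a single number."""
    if number % 3 == 0 and number % 5 == 0:
        return "fizzbuzz"
    elif number % 3 == 0:
        return "fizz"
    elif number % 5 == 0:
        return "buzz"
    return number


def fizz_buzz_with_transform(numbers):
    # The comprehension only iterates; all decision logic lives in transform().
    return [transform(number) for number in numbers]
```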
Let's compare all three versions:
Extracting a separate function adds some overhead. The list comprehension with a separate transform() function is around 17% slower than the initial for loop-based version (224/191 ≈ 1.173). But it's much more readable, so I prefer it over the other solutions.
And, if you are curious, the one-line list comprehension mentioned before is the fastest solution:
Fastest, but also harder to read. If you run this code through a code formatter like black (which is a common practice in many projects), it will further obfuscate this function:
There is nothing wrong with black here - we are simply putting too much logic inside the list comprehension. If I had to say what the above code does, it would take me much longer to figure it out than if I had two separate functions. Saving a few hundred milliseconds of execution time and adding a few seconds of reading time doesn't sound like a good trade-off.
Clever one-liners can impress some recruiters during code interviews. But in real life, separating logic into different functions makes it much easier to read and document your code. And, statistically, we read more code than we write.
Conclusions
List comprehensions are often not only more readable but also faster than using for loops. They can simplify your code, but if you put too much logic inside, they will instead become harder to read and understand.
Even though list comprehensions are popular in Python, they have a specific use case: when you want to perform some operations on a list and return another list. And they have limitations - you can't break out of a list comprehension or put comments inside. In many cases, for loops will be your only choice.
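As an illustration of the break limitation, here is a sketch of a loop that stops early. The function name evens_until_limit and its parameters are hypothetical:

```python
def evens_until_limit(numbers, limit):
    """Collect even numbers, but stop at the first value greater than limit."""
    output = []
    for number in numbers:
        if number > limit:
            # Stop iterating entirely; a list comprehension has no break statement.
            break
        if number % 2 == 0:
            output.append(number)
    return output


print(evens_until_limit([2, 4, 7, 8, 100, 6], limit=50))  # [2, 4, 8]
```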
I have only scratched the surface of how useful list comprehensions (or any other type of comprehension in Python) can be. If you want to learn more, Trey Hunner has many excellent articles and talks on this subject (for example, this one for beginners).