A simple trick to supercharge your loops using “loop unrolling”
Today’s devices are incredibly powerful, often leading us to overlook the importance of efficiency and optimizations. It’s easy to think: why bother with efficiency when we have high-performance hardware like the monster M2 SoC in our Macs or iPads? However, adopting such a mindset is not healthy. Occasionally, it’s essential to revisit the fundamentals and consider code optimization tips. They have the potential to enrich our knowledge and enhance our skills as developers, even if they may not always be practical.
Now, let’s explore a function we use daily: the filter method.
Writing our own filter method
Filtering arrays is a common task, which is why it’s fun to try optimizing it.
Let’s take an array of names and try to filter a specific name:
The code for this task is quite straightforward. We iterate over the array from the lower bound to the upper bound, access each item, and compare it. Finally, we return all the filtered items.
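The original listing is not reproduced here, so the following is only a minimal sketch of the loop described above. The function name `filterByIndex`, its parameter labels, and the sample names are assumptions made for illustration.

```swift
// Hypothetical helper: walks the array from its lower bound to its
// upper bound, reads each item by index, and keeps the matches.
func filterByIndex(_ names: [String], matching target: String) -> [String] {
    var result: [String] = []
    let lowerBound = names.startIndex
    let upperBound = names.endIndex
    for index in lowerBound..<upperBound {
        let name = names[index]      // access the item by its index
        if name == target {          // compare it with the name we want
            result.append(name)
        }
    }
    return result
}

let sampleNames = ["Avi", "Dana", "Avi", "Noa"]
print(filterByIndex(sampleNames, matching: "Avi")) // prints ["Avi", "Avi"]
```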
I performed a test on 500,000 items, and the execution time was 0.056 seconds. Not bad! Now, let’s explore some potential optimizations to further improve its performance.
Using forEach
Some of you may have noticed that the previous function seemed a bit cumbersome. After all, why bother with creating lower and upper bounds when we can simply utilize the forEach method on the array? Let’s take a look at an example of the same code but implemented using forEach:
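A forEach version of the same filter might look roughly like this. Again, this is a sketch rather than the original listing, and the name `filterWithForEach` is an assumption.

```swift
// Hypothetical helper: same logic as before, but the array hands us
// each item directly, so there are no bounds or indices to manage.
func filterWithForEach(_ names: [String], matching target: String) -> [String] {
    var result: [String] = []
    names.forEach { name in
        if name == target {
            result.append(name)
        }
    }
    return result
}

print(filterWithForEach(["Avi", "Dana", "Avi", "Noa"], matching: "Avi")) // prints ["Avi", "Avi"]
```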
Now, when I execute the code on the same 500,000 items, the results are significantly faster, with an execution time of 0.031 seconds. Going by those two figures, that’s roughly 45% less time than the first version!
But why does this happen? Well, beneath the surface, the forEach function itself is still performing an iteration. However, it’s worth noting that forEach is a built-in method that may be optimized by the underlying runtime or compiler, potentially resulting in improved performance compared to manually implemented loops.
Iteration in Swift is often more optimized than accessing items individually. Subscripting an array by index can involve a bounds check on every access, whereas sequential iteration gives the compiler more room to streamline item access and makes it easier to pre-fetch upcoming items and minimize latency. However, can we push the boundaries of optimization even further? Let’s embark on that journey and explore additional ways to enhance our code 😊
Handle two items in each iteration
Up until now, we have been handling one item 500,000 times. However, what if we try a different approach and handle two items together, but with only 250,000 iterations? I have a hunch that processing 500,000 iterations might come with a performance cost. Let’s put it to the test and see the results:
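As a sketch of the idea, assuming the same kind of name array, the loop can step through the indices two at a time and check both items in one pass. The name `filterTwoAtATime` is an assumption, and the extra check at the end is added here so that an array with an odd number of items does not lose its last element.

```swift
// Hypothetical helper: handles two items per iteration, roughly
// halving the number of times the loop condition is evaluated.
func filterTwoAtATime(_ names: [String], matching target: String) -> [String] {
    var result: [String] = []
    let count = names.count
    let pairedEnd = count - count % 2   // largest even number <= count

    for index in stride(from: 0, to: pairedEnd, by: 2) {
        let first = names[index]
        let second = names[index + 1]
        if first == target { result.append(first) }
        if second == target { result.append(second) }
    }

    // If the count is odd, one item is left over after the pairs.
    if pairedEnd < count {
        let last = names[pairedEnd]
        if last == target { result.append(last) }
    }
    return result
}

print(filterTwoAtATime(["Avi", "Dana", "Noa", "Avi", "Avi"], matching: "Avi")) // prints ["Avi", "Avi", "Avi"]
```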
Recall the execution time of 0.031 seconds when we used the forEach loop. With this new approach, we have achieved an impressive performance boost, resulting in an execution time of just 0.007 seconds. Going by those figures, that is roughly a 77% improvement compared to the forEach loop, and an astounding 87% improvement compared to our original code!
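Look at the differences: 0.056 seconds for the original loop, 0.031 seconds for forEach, and 0.007 seconds when handling two items per iteration.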
That’s amazing, but why does it happen?
The trick is called “loop unrolling”. In loop unrolling, we execute multiple steps in one loop iteration, thereby decreasing the total number of iterations. You may find yourself asking, “But it’s the same logic, what the hell?” Well, it turns out that having many iterations comes with a cost. First, every iteration carries overhead: handling the loop, checking the condition, and jumping back to the start all require additional instructions, so fewer iterations means fewer of those instructions overall. Second, once the CPU “knows” more about the instructions that follow, it can schedule that work more efficiently.
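The same idea extends beyond two items per iteration. Here is a minimal sketch that unrolls a summing loop by four, with a small cleanup loop for whatever is left over. The function name `sumUnrolledByFour` and the choice of four are assumptions for illustration. Keep in mind that an optimizing compiler may already unroll simple loops on its own, so measure before assuming a manual version is faster.

```swift
// Hypothetical example: four additions per iteration, so the loop
// condition and the jump back run about a quarter as often.
func sumUnrolledByFour(_ values: [Int]) -> Int {
    var total = 0
    let count = values.count
    let unrolledEnd = count - count % 4   // largest multiple of 4 <= count
    var index = 0

    while index < unrolledEnd {
        total += values[index] + values[index + 1]
               + values[index + 2] + values[index + 3]
        index += 4
    }

    // Cleanup loop for the zero to three leftover items.
    while index < count {
        total += values[index]
        index += 1
    }
    return total
}

print(sumUnrolledByFour(Array(1...10))) // prints 55
```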
Final words about optimizations
The first rule of optimizations is “Don’t”.
The second rule of optimizations is “Not yet”.
Please avoid immediately “unrolling” your loops as a result of this post. The intention here is to equip you with the knowledge to become a better programmer. Someday, you may encounter scenarios with heavy loops that require optimization, and loop unrolling may be a viable solution. In your daily work, 99% of the loops will work just fine.
Boost Your Swift Loop Performance By 87% was originally published in Better Programming on Medium.