Your Escape Plan From Numpy + Cython
This talk uses a math equation as a benchmark. Instead of optimizing it with Cython (painful and unpythonic), I will present three Pythonic solutions for accelerating NumPy. At the end, I will cover the pros and cons of the three solutions and give some recommendations based on my experience.
If you've been a data scientist or researcher long enough, you have probably encountered a situation where your NumPy code ran quickly on small datasets in a testing environment but performed poorly on real-world datasets (100x larger or more). In this talk, I will introduce three Pythonic solutions that improve NumPy performance drastically without requiring much change to your code.
At the beginning of the talk, I will introduce logsumexp, a math function that is widely used in machine learning. I will show how it is implemented in pure NumPy and use that implementation as a benchmark, so we can compare it to the three proposed solutions at the end of the talk.
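For reference, logsumexp of a vector x is log(sum_i exp(x_i)). The following is a minimal pure-NumPy sketch, assuming a 1-D float array and the common max-subtraction trick for numerical stability. The function name and the sample input are illustrative and are not taken from the talk's actual benchmark code.

```python
import numpy as np

def logsumexp(x):
    """Compute log(sum(exp(x))) for a 1-D array without overflow."""
    x = np.asarray(x, dtype=np.float64)
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    x_max = np.max(x)
    return x_max + np.log(np.sum(np.exp(x - x_max)))

# Illustrative input: the naive np.log(np.sum(np.exp(values))) overflows to inf here.
values = np.array([1000.0, 1000.0])
result = logsumexp(values)
```

The shift by the maximum does not change the mathematical result, because log(sum(exp(x))) = m + log(sum(exp(x - m))) for any constant m.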
Then I will present the three solutions, CuPy, Numba, and Pythran, in separate sections. In each section, I will briefly introduce the solution and show how to apply it to our benchmark code.
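To give a flavor of what applying one of these solutions can look like, here is a hedged sketch using Numba's `njit` decorator on a loop-based logsumexp. It is not the talk's code: the function name, the input data, and the fallback to plain Python when Numba is not installed are all assumptions made so the example runs anywhere. CuPy and Pythran follow different routes (a NumPy-like array API on the GPU, and ahead-of-time compilation, respectively) and are not shown here.

```python
import math
import numpy as np

try:
    from numba import njit  # JIT-compiles the decorated function on first call
except ImportError:
    # Assumption for this sketch: run as plain Python if Numba is unavailable.
    def njit(func):
        return func

@njit
def logsumexp_loop(x):
    # Explicit loops are slow in CPython but compile to tight machine code under Numba.
    x_max = x[0]
    for i in range(1, x.shape[0]):
        if x[i] > x_max:
            x_max = x[i]
    total = 0.0
    for i in range(x.shape[0]):
        total += math.exp(x[i] - x_max)
    return x_max + math.log(total)

# Illustrative data only.
data = np.linspace(-5.0, 5.0, 1000)
numba_result = logsumexp_loop(data)
```

With Numba installed, the first call includes compilation time, so any timing comparison should call the function once before measuring.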
At the end of the talk, I will compare these solutions from different aspects:<pre>
* How much each solution boosts performance
* How easy each solution is to apply to your existing code (including ease of debugging)
* Limitations of each solution
* Which solution should be applied first in given scenarios</pre>
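The performance comparison in the first point could be set up roughly as follows. This is a minimal sketch using the standard-library `timeit` module; the helper `best_time`, the `candidates` dictionary, and the array size are hypothetical choices, and only the NumPy baseline is registered here.

```python
import timeit
import numpy as np

def logsumexp_numpy(x):
    # Baseline: numerically stable logsumexp in pure NumPy.
    x_max = np.max(x)
    return x_max + np.log(np.sum(np.exp(x - x_max)))

def best_time(func, data, repeat=5, number=10):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
    return min(timings) / number

rng = np.random.default_rng(0)
data = rng.standard_normal(100_000)

# Accelerated implementations would be added to this dictionary.
candidates = {"numpy": logsumexp_numpy}
results = {name: best_time(func, data) for name, func in candidates.items()}

# Speedup relative to the NumPy baseline (values above 1 mean faster).
baseline = results["numpy"]
speedups = {name: baseline / seconds for name, seconds in results.items()}
```

Taking the minimum over repeats reduces noise from other processes. JIT-compiled candidates should be called once before timing so that compilation is not counted.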
Last but not least, I will introduce a relatively new but interesting solution, Transonic, so the audience can give it a try in their side projects.
A Python performance-tuning enthusiast who tweaks an ML platform for a cybersecurity company. I'm also a cybersecurity hobbyist who pokes at websites on the Internet.