Week #4: Improving binned_statistic_dd and _voronoi, and fixing some issues

Xingyu-Liu
Published: 07/06/2021

What did you do this week?

First, back to the old problem: the bus error. It turns out to be specific to Mac, and we still don't know the cause. (The Pythran version of _count_paths_outside_method raises a bus error on Mac but works fine on Linux.)

Last week I said that the benchmark result was different from my timeit result. That was actually my mistake: I forgot to modify setup.py. After setting it up correctly, the problem was fixed.

Also, I have made a PR for binned_statistic_dd, the algorithm I have been improving since last week. At first, I improved the whole if-elif block, and the benchmark showed that this makes count, sum, and mean about 1.1x faster and std, median, min, and max 3x-30x faster. However, I found that Pythran doesn't support object-type input, so some tests failed. To support object type, we would need to keep all of the pure Python code as well, which would duplicate the if-elif block and make it ugly. Since the benchmark shows little improvement for count, sum, and mean, I also tried improving only std, median, min, and max, to keep the code cleaner and easier to understand. So in the end, I only improved a small inner function, but std, median, min, and max are still 3x-30x faster, with no changes for count, sum, and mean. (ENH: improved binned_statistic_dd via Pythran)
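
To make the "small inner function" idea concrete, here is a minimal pure-Python sketch of a per-bin reduction that only handles std, median, min, and max. It is not the code from the PR: the names calc_binned_statistic and REDUCERS are made up for illustration, and the real function works on NumPy arrays and is compiled with Pythran.

```python
import math


def _std(xs):
    # Population standard deviation (divides by n, not n - 1).
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def _median(xs):
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2


# Hypothetical dispatch table: only the statistics that benefited are here.
REDUCERS = {"std": _std, "median": _median, "min": min, "max": max}


def calc_binned_statistic(bin_numbers, values, statistic):
    """Group values by bin number and reduce each group (illustrative only)."""
    groups = {}
    for b, v in zip(bin_numbers, values):
        groups.setdefault(b, []).append(v)
    reduce_fn = REDUCERS[statistic]
    return {b: reduce_fn(vs) for b, vs in groups.items()}


bins = [0, 0, 1, 1, 1]
vals = [1.0, 3.0, 2.0, 4.0, 6.0]
result_min = calc_binned_statistic(bins, vals, "min")
result_median = calc_binned_statistic(bins, vals, "median")
result_std = calc_binned_statistic(bins, vals, "std")
```

Keeping count, sum, and mean out of the helper means the existing pure Python path, including object-type input, stays untouched for those statistics.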

While I was improving binned_statistic_dd, there happened to be an open issue about floating-point comparison. I looked into it and fixed it. (BUG: fix stats.binned_statistic_dd issue with values close to bin edge)
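
For readers unfamiliar with this kind of bug, the sketch below shows how a value that should sit exactly on a bin edge can land in the neighbouring bin because of floating-point rounding. It is a generic illustration, not the fix from the PR: bin_index and bin_index_tolerant are hypothetical helpers, and the tolerance of 1e-9 is an arbitrary assumption.

```python
import bisect
import math


def bin_index(edges, x):
    """Naive binning: index i such that edges[i] <= x < edges[i + 1]."""
    return bisect.bisect_right(edges, x) - 1


def bin_index_tolerant(edges, x, rel_tol=1e-9):
    """Snap x to an edge it matches within rel_tol, then bin as usual."""
    for e in edges:
        if math.isclose(x, e, rel_tol=rel_tol):
            x = e
            break
    return bisect.bisect_right(edges, x) - 1


# Edges computed by multiplication: 0.1 * 3 is slightly larger than 0.3.
edges = [0.1 * i for i in range(5)]

naive = bin_index(edges, 0.3)              # falls just below the computed edge
tolerant = bin_index_tolerant(edges, 0.3)  # treated as lying on the edge
```

Here the naive comparison puts 0.3 in the bin that ends at the computed edge, while the tolerant version puts it in the bin that starts there.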

Last but not least, I tried to speed up _voronoi, which we discussed last week, and the Pythran version is 3x faster than the Cython one!

What is coming up next?

  • Refer to the original Python version rather than the Cython one to make the Pythran version of _voronoi more readable. After that, make a PR.
  • Add a test for the binned_statistic_dd bug
  • Add benchmarks for somersd and _tau_b
  • Prepare for the first evaluations
  • In Pythran, import some SciPy tests

Did you get stuck anywhere?

The bus error mentioned above is still unsolved, and build_docs recently failed on my PR.