What did you do this week?
As this project progressed, I came to realize that the difficulty of this project lies in finding good candidate algorithms. This week I searched for candidates in `scipy.stats`. At first, as before, I used `pytest --durations=50` to find the 50 slowest tests. I looked into all of those functions but didn't find any suitable algorithms: some have no obvious loops; others have loops but call another SciPy method inside the loop. Therefore, I began checking all the functions under `scipy/stats` one by one and finally found good candidates, including `_tau_b`, and I submitted a PR (gh-14308) to speed them up by 4x to 20x.
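To make the selection criterion concrete, here is a minimal sketch of the kind of function I look for: explicit scalar loops over plain NumPy arrays, with no calls into other SciPy routines. The function name `count_concordant_pairs` and its export signature are hypothetical and are not taken from gh-14308. The export line is only a comment, so the file also runs as ordinary Python.

```python
import numpy as np


#pythran export count_concordant_pairs(float64[:], float64[:])
def count_concordant_pairs(x, y):
    """Count pairs (i, j) with i < j whose x and y differences share a sign."""
    n = x.shape[0]
    count = 0
    # Nested scalar loops: slow in pure Python, and free of SciPy calls.
    for i in range(n):
        for j in range(i + 1, n):
            if (x[i] - x[j]) * (y[i] - y[j]) > 0:
                count += 1
    return count


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])
n_concordant = count_concordant_pairs(x, y)
```

Functions that are already a single vectorized NumPy call, or that call back into SciPy inside the loop, do not fit this pattern, which is why most of the slow tests turned out not to be useful.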
Besides, here is an update on the work mentioned last week:
- `stats._moment`: submitted an issue (pythran #1820) because `keepdims` is not supported in `np.mean()`.
- `stats._calc_binned_statistic`: successfully improved this function, which made the public function `stats.binned_statistic_dd` 3x to 10x faster for `min`, `max`, `std`, and `median`. I tried to improve the whole `if-elif` block but encountered some errors that I couldn't fix (see pythran #1819).
- `stats._sum_abs_axis0`: thanks to Serge, the compilation error due to the variant type is fixed. The compiled `_sum_abs_axis0` is ~2x faster, but there is not much gain on the public function `onenormest`. Moreover, there is actually no loop in `_sum_abs_axis0` for input sizes smaller than 2**20 (my bad!).
- `sparse.linalg.expm` (`_fragment_2_1`): last week I said this one is slower than the pure Python version, but it is not. My bad: with the input I used at the time, the code never reached `_fragment_2_1`, so I didn't actually test it. The function is only reached when the input size is larger than 9000. Moreover, the input is a `csc matrix`, so it is not a suitable candidate for Pythran.
- the SciPy build error (see pythran #1815) mentioned last week: this one has been a real struggle. Serge and Ralf tried to help me fix it, but it is still not working for now.
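The remark about `_sum_abs_axis0` having no real loop for small inputs can be illustrated with a sketch. It assumes the helper sums absolute values over blocks of 2**20 rows; `sum_abs_axis0_blocked` and its iteration counter are hypothetical stand-ins written for this explanation, not SciPy's source.

```python
import numpy as np

BLOCK_SIZE = 2**20  # assumed block size, matching the threshold quoted above


def sum_abs_axis0_blocked(X, block_size=BLOCK_SIZE):
    """Column-wise sum of absolute values, accumulated one row block at a time.

    Returns the sums and the number of loop iterations that were needed.
    """
    result = None
    n_blocks = 0
    for start in range(0, X.shape[0], block_size):
        partial = np.sum(np.abs(X[start:start + block_size]), axis=0)
        result = partial if result is None else result + partial
        n_blocks += 1
    return result, n_blocks


X = np.array([[1.0, -2.0], [-3.0, 4.0], [5.0, -6.0]])

# Default block size: 3 rows fit in one block, so the loop body runs once.
total, n_blocks = sum_abs_axis0_blocked(X)

# A tiny block size forces a second iteration but gives the same sums.
total_small, n_blocks_small = sum_abs_axis0_blocked(X, block_size=2)
```

When the loop body runs only once, the work is a single vectorized NumPy call, which would be consistent with the small gain I saw on `onenormest`.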
What is coming up next?
- add benchmarks for
- consider merging the old benchmark PR? (gh-14228)
- keep searching for good candidate algorithms to improve.
Did you get stuck anywhere?
The problem I encountered when improving the whole `stats.binned_statistic_dd` (see pythran #1819).