Week #2: Improving stats.ks_2samp

Xingyu-Liu
Published: 06/21/2021

What did you do this week?

This week was quite a struggle. My mentor Serge quickly implemented support for `scipy.special.binom` in Pythran, and it gives a great improvement on the public function `stats.ks_2samp` (2.62 ms vs 88.2 ms). However, when I built SciPy with the improved algorithm, we found that it caused a loop problem. Serge made a PR to break the loop, but it is still not working on my computer.
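For reference, a per-call timing like the one quoted above can be taken with a small script along these lines. This is only a sketch: the sample sizes, the seed, and the number of repetitions are my assumptions, not the exact setup behind the numbers above, so the absolute timings will differ.

```python
import timeit

import numpy as np
from scipy import stats

# Hypothetical inputs: two normal samples of modest size (assumed sizes and seed).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=600)

# Run the test once to get the statistic and p-value.
result = stats.ks_2samp(x, y)

# Time 10 calls per run, repeat 3 times, and keep the best run.
runs = timeit.repeat(lambda: stats.ks_2samp(x, y), number=10, repeat=3)
per_call = min(runs) / 10

print(f"statistic={result.statistic:.4f}, p-value={result.pvalue:.4f}")
print(f"best time per call: {per_call * 1e3:.3f} ms")
```

Running the same script against a build with and without the Pythran version gives the two numbers to compare.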

Then I turned to the other algorithms that I mentioned last week, but I ran into more problems:

  1. stats._moment: the keepdims argument of np.mean() is not supported
  2. stats._calc_binned_statistic: Pythran reports an invalid spec, but I could not find anything wrong with it
  3. stats.rankdata: invalid Pythran spec
  4. stats._sum_abs_axis0: compilation error
  5. sparse.linalg.expm (_fragment_2_1): much slower than the pure Python version; I will keep investigating it.
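For the first problem, one possible workaround is to take the mean without the keepdims argument and then put the reduced axis back by hand. The sketch below shows the idea in plain NumPy; the helper name `mean_keepdims` is made up for illustration, and I have not checked that Pythran accepts this version.

```python
import numpy as np


def mean_keepdims(a, axis):
    """Mean along `axis`, re-inserting the reduced axis with length 1.

    Hypothetical helper: mimics np.mean(a, axis=axis, keepdims=True)
    without passing the keepdims argument.
    """
    m = np.mean(a, axis=axis)
    return np.expand_dims(m, axis)


a = np.arange(6, dtype=float).reshape(2, 3)
expected = np.mean(a, axis=1, keepdims=True)  # reference result from NumPy itself
result = mean_keepdims(a, axis=1)

print(result.shape)
print(np.allclose(result, expected))
```

The result has the same number of dimensions as the input, so it still broadcasts against the original array in the same way as the keepdims version.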

What is coming up next?

  • Submit an issue for the unsupported keepdims argument
  • Submit an issue for the compilation error in stats._sum_abs_axis0
  • Continue improving stats._calc_binned_statistic
  • Find out why the Pythran version of sparse.linalg.expm is slower

Did you get stuck anywhere?

Yes, I got stuck on many problems, as described in the "What did you do this week?" section above.