Week #0: Community Building and Getting Started

Published: 06/08/2021


Hi everyone! I’m Xingyu Liu, a first-year data science master’s student at Harvard University. I’m very excited to have been accepted by SciPy, where I will work on using Pythran to improve the performance of SciPy’s algorithms! Many algorithms are currently too slow in pure Python, and Pythran can be a good tool for accelerating them. My goal is to investigate and improve these slow algorithms, and to write benchmarks for them.

What did you do this week?

During the community bonding period, I met with my mentors, Ralf Gommers and Serge Guelton. They are very kind, responsive, and helpful. We discussed my project and set up a chat and a weekly sync. Over the last week, I started working on my project:

Issues:


  1. Pythran makes np.searchsorted much slower
  2. u_values[u_sorter].searchsorted would cause "Function path is chained attributes and name" but np.searchsorted would not
  3. all_values.sort() would cause compilation error but np.sort(all_values) would not
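
In plain NumPy, the method and function spellings mentioned in these issues give the same results, so switching between them is a safe workaround. Here is a minimal sketch with made-up sample data (the array values are my own assumption, not taken from SciPy):

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
u_values = np.array([3.0, 1.0, 2.0])
v_values = np.array([2.5, 0.5])
u_sorter = np.argsort(u_values)

all_values = np.concatenate((u_values, v_values))

# Method spelling: sorts the array in place.
sorted_in_place = all_values.copy()
sorted_in_place.sort()

# Function spelling: returns a sorted copy.
sorted_by_function = np.sort(all_values)

# Method spelling of searchsorted ...
idx_method = u_values[u_sorter].searchsorted(sorted_by_function[:-1], "right")
# ... and the equivalent function spelling.
idx_function = np.searchsorted(
    u_values[u_sorter], sorted_by_function[:-1], side="right"
)

print(np.array_equal(sorted_in_place, sorted_by_function))
print(np.array_equal(idx_method, idx_function))
```

Both comparisons print `True` in NumPy. The difference described in the issues only shows up when the code is compiled with Pythran.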

Pull Requests:

  1. ENH: Pythran implementation of _cdf_distance
  2. BENCH: add benchmark for energy_distance and wasserstein_distance
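
To give an idea of what such a benchmark can look like, here is a rough sketch in the style of asv, the tool SciPy uses for benchmarks: a `setup` method prepares the data and each `time_*` method is timed. The class name, array sizes, and the `timeit` call at the end are my own assumptions for illustration, not the code from the pull request.

```python
import timeit

import numpy as np
from scipy import stats


class DistanceSuite:
    """Hypothetical asv-style suite; the name and sizes are illustrative."""

    def setup(self):
        # Fixed seed so that every run times the same input data.
        rng = np.random.default_rng(12345678)
        self.u_values = rng.random(1000)
        self.v_values = rng.random(1000) + 0.5

    def time_energy_distance(self):
        stats.energy_distance(self.u_values, self.v_values)

    def time_wasserstein_distance(self):
        stats.wasserstein_distance(self.u_values, self.v_values)


# Quick local check without asv: time one method by hand.
suite = DistanceSuite()
suite.setup()
elapsed = timeit.timeit(suite.time_wasserstein_distance, number=10)
print(f"wasserstein_distance: {elapsed / 10:.6f} s per call")
```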


  1. Pythran tutorial.
  2. Profiling Cython code
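
For readers new to Pythran: a Pythran module is an ordinary Python file with a `#pythran export` comment that declares the argument types of each exported function. Below is a simplified sketch of an unweighted CDF distance for p = 1. The function name and file name are hypothetical, this is not the code from my pull request, and I have only checked that it runs as plain Python, not that it compiles with Pythran.

```python
# Hypothetical file name: cdf_distance_sketch.py
# Assumption: with Pythran installed, `pythran cdf_distance_sketch.py`
# would build a native module from this file.
import numpy as np


#pythran export cdf_distance_p1(float64[], float64[])
def cdf_distance_p1(u_values, v_values):
    # Sort each sample and the pooled sample.
    u_sorted = np.sort(u_values)
    v_sorted = np.sort(v_values)
    all_values = np.sort(np.concatenate((u_values, v_values)))

    # Gaps between successive pooled values.
    deltas = np.diff(all_values)

    # Empirical CDFs of u and v evaluated at the pooled values.
    u_cdf = np.searchsorted(u_sorted, all_values[:-1], "right") / u_values.size
    v_cdf = np.searchsorted(v_sorted, all_values[:-1], "right") / v_values.size

    # Integrate |U - V| over the pooled grid.
    return np.sum(np.abs(u_cdf - v_cdf) * deltas)


result = cdf_distance_p1(np.array([0.0, 1.0, 3.0]), np.array([5.0, 6.0, 8.0]))
print(result)
```

Because the export line is just a comment, the same file still works as normal Python, which makes it easy to compare the two versions.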

What is coming up next?

  1. Write benchmarks for inferential stats
  2. Switch to the new random API, `rng = np.random.default_rng(12345678)` (following the comments in "BENCH: add benchmark for f_oneway")
  3. Find more algorithms that could be sped up with Pythran
  4. Document why some functions can’t be sped up with Pythran
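
As a small illustration of the random API change in item 2, here is a sketch comparing the legacy global-state calls with the newer `Generator` object. The sample size of 5 is an arbitrary choice for illustration.

```python
import numpy as np

# Legacy style: seeds NumPy's global random state.
np.random.seed(12345678)
legacy_sample = np.random.rand(5)

# New style: a local Generator that does not touch global state.
rng = np.random.default_rng(12345678)
sample = rng.random(5)

# A fresh Generator with the same seed reproduces the same numbers.
same_again = np.random.default_rng(12345678).random(5)
print(np.array_equal(sample, same_again))
```

The two styles use different algorithms, so `legacy_sample` and `sample` are not expected to match even with the same seed.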

Did you get stuck anywhere?

For my first pull request, we found that the Pythran version does not outperform the original because of the indexing operations.