Trying to parallelize _compute_pdfs_dense_3d

Over the last week, I reorganized my fork of the DIPY repository. The aff-par-grad-ori repository is the original one, that is, a clone of upstream/master. In the aff-par-grad-all repository, I parallelized _gradient_3d() and _sparse_gradient_3d() in vector_fields.pyx using dynamic allocation (malloc). In the aff-par-grad-fun repository, I parallelized the same two functions using a local function. For each of these three repositories, I also created a xxx-xxx-xxx-xxx-tim repository that adds a timer around the parallel part of the function. The malloc version gave a 4.45x speedup on average, while the local-function version gave a 6.59x speedup on average. The tests included runs on the cluster with 24 cores (48 threads).
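The speedup figures above are ratios of serial to parallel wall-clock time. Here is a minimal sketch of how such a ratio could be measured, written in plain Python rather than Cython. The helpers `time_call` and `speedup` are hypothetical names for illustration; they are not part of DIPY and are not the timer used in the -tim repositories.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the mean wall-clock time of func(*args) over `repeats` runs."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        total += time.perf_counter() - start
    return total / repeats


def speedup(serial_seconds, parallel_seconds):
    """Speedup factor: how many times faster the parallel version ran."""
    if parallel_seconds <= 0:
        raise ValueError("parallel time must be positive")
    return serial_seconds / parallel_seconds
```

Averaging over several runs matters on a shared cluster, where a single measurement can be distorted by other jobs.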

speedup of parallelized _gradient_3d()

I also profiled all three repositories. Both methods gave a speedup, and the local-function version was slightly faster.

profile of

Next, I tried to parallelize _compute_pdfs_dense_3d() in parzenhist.pyx using a local function. However, this actually slowed execution down. I may need to try another method, such as adding a lock or using dynamic allocation (malloc). If I can speed up _compute_pdfs_dense_3d(), _update_histogram should become faster in the profile.

I also tried to parallelize _joint_pdf_gradient_dense_3d() in parzenhist.pyx. I first tried dynamic allocation (malloc) together with a lock, but did not finish that version. Then I tried the local-function approach. This requires a local buffer whose dimension is not known beforehand, so I have to allocate it dynamically (malloc). Also, to avoid the ‘with gil’ statement, I had to change the _jacobian() functions to versions that do not use memory views. I made these changes in the aff-par-jpdf-fun repository, and they still need more testing. If this parallelization works, _update_mutual_information should become faster in the profile.
