AshwinB's Blog

Weekly check-in #5: 19/06 to 27/06

AshwinB
Published: 06/28/2019

Hello everyone. Welcome to the fifth of my 12 weekly updates.

What did you do this week?

This week I finally broke the curse and finished the `xgboost` and `lightgbm` implementation for SHAP predictions. The problem I was tackling last week came from trying to support multiple documents, which eli5 itself does not support. With clarification from @prado (my mentor), I was able to implement it for a single document, which eli5 already supports. Right now I'm making minor changes to make the code more compact before merging, and also implementing it for `catboost`, which will complete Phase One. All the existing test cases have also been run against my implementation, so it is almost ready. At @prado's suggestion, I am also writing a few extra test cases to check the validity of the model explanation.
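As a rough illustration of what "checking the validity of the model explanation" can mean, here is a minimal sketch of one property a SHAP-style explanation should satisfy: the bias plus the per-feature contributions should add up to the model's prediction. Everything in it is an assumption made for illustration: `toy_model`, `value_fn` and `shapley_values` are hypothetical names, the brute-force enumeration is not the TreeSHAP algorithm, and none of this is taken from eli5's actual test suite.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every feature subset (toy sizes only)."""
    values = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            # Standard Shapley weight for a coalition of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                present = frozenset(subset)
                total += weight * (value_fn(present | {i}) - value_fn(present))
        values.append(total)
    return values


def toy_model(x):
    """Hypothetical stand-in for a trained model (not xgboost or lightgbm)."""
    return 3.0 * x[0] + 2.0 * x[1] * x[2]


doc = [1.0, 2.0, 3.0]        # the single document being explained
baseline = [0.0, 0.0, 0.0]   # assumed reference values for "absent" features


def value_fn(present):
    """Model output when only the features in `present` take the document's values."""
    mixed = [doc[j] if j in present else baseline[j] for j in range(len(doc))]
    return toy_model(mixed)


phi = shapley_values(value_fn, len(doc))
bias = value_fn(frozenset())
prediction = toy_model(doc)

# The validity check: bias + sum of contributions reproduces the prediction.
assert abs(bias + sum(phi) - prediction) < 1e-9
```

A test along these lines does not depend on the display code at all, which is why it can catch explanation bugs that the existing formatting tests would miss.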

What is coming up next?

  • Adding additional test cases for the present implementation of `xgboost` and `lightgbm`.
  • Complete `catboost` implementation.
  • Comment and document the already integrated implementation.
  • Finish Phase One and begin Phase Two.

Did you get stuck anywhere?

This week was fairly smooth compared to last week. Most of my time was spent discussing and trimming code according to my mentor's suggestions.
Thank You for staying till the end. :)

Weekly check-in #4: 10/06 to 18/06

AshwinB
Published: 06/19/2019

Hello everyone. Welcome to the fourth of my 12 weekly updates.

What did you do this week?

This week I was trying to integrate the eli5 display for multi-class classification weights. I was weighing two options: adding one more display feature to the core eli5 module, or adapting my implementation to use the existing display function. I got stuck trying to use the existing display functions.

What is coming up next?

  • Sort out the mess I am in with the multi-class classification display in the eli5 module.
  • Raise a PR for Phase One. (done)
  • Comment and document the already integrated implementation.

Did you get stuck anywhere?

I was stuck for a couple of days on the multi-class implementation. I am currently working towards at least getting xgboost completely working and passing the existing test cases. With suggestions from my mentor, I am working in a new direction. This is a common function, so once it is sorted out, it will also apply to catboost and lightgbm multi-class classification.

Thank You for staying till the end. :)

Weekly check-in #3: 30/05 to 9/06

AshwinB
Published: 06/09/2019

Hello everyone. Welcome to the third of my 12 weekly updates.

What did you do this week?

My decision-making process was tested this week. According to my timeline, I was supposed to be done with Phase One of my project by this week, but I am only 2/3 of the way there. The reason is that code was getting duplicated, and I had to decide whether to reuse or customize existing functions without breaking the previous implementations. More of my time was spent making decisions than coding. My mentor then suggested that I open draft pull requests so he could review whether I was on the desired path. Based on his feedback, I was able to polish the existing implementation and write test cases for it.

What is coming up next?

  • Integrate LightGBM SHAP explainability in eli5.
  • Raise a PR for Phase One.
  • Start investigating the TreeSHAP paper and break it down, cross-checking with the TreeExplainer of the SHAP library to incorporate sklearn tree-based models. (This will be mentioned every week over the next 2 development weeks for each model incorporated that week.)

Did you get stuck anywhere?

I was stuck choosing between reusing code and writing a custom function that duplicated part of the reusable code. @prado helped me figure out that I could omit some variables and use the `partial` function from `functools` to achieve reusability.
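To show the kind of reuse that `functools.partial` makes possible, here is a minimal sketch. The function names (`explain_with_contributions`, `toy_linear_contributions`) and the feature names are hypothetical and are not the actual eli5 code; the idea is only that a shared helper can stay generic while the model-specific arguments are bound once up front.

```python
from functools import partial


def explain_with_contributions(contribution_fn, feature_names, doc, top=None):
    """Hypothetical shared helper: rank features by absolute contribution."""
    contributions = contribution_fn(doc)
    ranked = sorted(
        zip(feature_names, contributions),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return ranked[:top] if top is not None else ranked


def toy_linear_contributions(weights, doc):
    """Stand-in for a model-specific contribution routine (not a real eli5 or SHAP call)."""
    return [w * x for w, x in zip(weights, doc)]


# Bind the model-specific arguments once; callers then only pass the document.
contribution_fn = partial(toy_linear_contributions, [0.5, -2.0, 1.0])
explain_toy = partial(
    explain_with_contributions, contribution_fn, ["age", "income", "score"]
)

result = explain_toy([2.0, 1.0, 0.5], top=2)
print(result)
```

With this shape, supporting another model would mean writing only a new contribution routine and binding it with `partial`, rather than copying the ranking logic.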
Thank You for staying till the end. :)


Weekly check-in #2: 20/05 to 29/05

AshwinB
Published: 05/30/2019

Hello everyone. Welcome to the second of my 12 weekly updates.

What did you do this week?

Got some solid work done this week. My PR with the previous changes (i.e. catboost) was accepted this week and is live in the eli5 library. @prado was very patient with me, as I had some cosmetic changes to make in the docs, and explained the reasons while iterating through my commits and code reviews. I also began work on Phase One (refer to the proposal for details) of the project on a separate branch, and have completed 1/3 of the intended work for it. I have set up my development environment and updated my fork with the two recent PRs. I also started dissecting the TreeSHAP paper into notes and will try to better grasp the algorithm for Phase Two.

What is coming up next?

  • Integrate XGBoost SHAP explainability in eli5.
  • Integrate CatBoost SHAP explainability in eli5.
  • Integrate LightGBM SHAP explainability in eli5.
  • Raise a PR for Phase One.
  • Start investigating the TreeSHAP paper and break it down, cross-checking with the TreeExplainer of the SHAP library to incorporate sklearn tree-based models. (This will be mentioned every week over the next 2 development weeks for each model incorporated that week.)

Did you get stuck anywhere?

I was caught up in trying to understand the shap library implementation, but then found a very helpful post by the person who published the paper and also built the library. So I have no blockers at the moment.
Thank You for staying till the end. :)

Weekly check-in #1: 13/05 to 20/05

AshwinB
Published: 05/21/2019

Hello everyone. Welcome to the first of my 12 weekly updates. I am working on integrating SHAP explainability into ELI5 (ScrapingHub).

What did you do this week?

This week was mostly spent in discussion with the mentors to change some portions of my proposal so that it aligns better with the organization's goals. Different implementations were considered and their pros and cons listed. With help from @prado and @kmike, my proposal was narrowed down to 3 main phases. I updated my proposal accordingly and re-uploaded it on this site. I have contributed a feature to ELI5 before; when I started, I wasn't aware of how to use tox and the native testing libraries in Python, but was able to get it passing with continuous feedback from @Konstantin. Hence, this week I invested time in picking up best practices in Python unit testing and also started getting familiar with Travis CI.

What is coming up next?

  • Set up tox in my local environment and install all the dependencies.
  • Keep learning about tox and unit testing.
  • Get more familiar with PEP 8, as my dependency on autopep8 for formatting was increasing.
  • Start investigating the TreeSHAP paper and break it down, cross-checking with the TreeExplainer of the SHAP library to incorporate sklearn tree-based models. (This will be mentioned every week over the next 3 development weeks for each model incorporated that week.)

Did you get stuck anywhere?

No specific issues, but the problem statement for the black-box model implementation, a.k.a. Phase 3, requires more groundwork. My main goal will be to integrate Phase 1 and Phase 2 quickly so that I can spend maximum time on finding an optimal method for Phase 3.

 

Thank You for staying till the end. :)

Ashwin Bhat
