yashlamba's Blog

Final week check-in

Published: 08/22/2019

Targets for this week:

Almost all my targets and extended goals were achieved before 19th August. This week I have been working on documentation for the scikit models in dffml. We want to make the package's docs as clear as possible before release.

About the final evaluation:

I haven't yet submitted the final evaluation at the time of writing this blog. I am waiting for the documentation to be finished in the next couple of days so that I can add it to the work chart too. I think I have done my work well, and I have bonded with the project so much that I think I will continue contributing to it for a long time.


The challenge right now is finding the best way to document the scikit model, which we have been brute-forcing in recent discussions. I'll post a couple more final blogs soon, one about contributing to larger projects as a beginner and probably one more about how to document your code.


Final Coding Week - Covering whatever is left

Published: 08/14/2019

Completed Work:

The majority of what I had proposed is complete now. I have successfully added the following models to dffml:

  1. Linear Regression from Scratch (Released with complete documentation)
  2. Scikit Models (Merged and waiting for documentation):
    • https://github.com/intel/dffml/blob/3d04591dc664fcde1b9a95650b40fb76b6569abf/model/scikit/dffml_model_scikit/scikit_models.py#L46-L87

Targets for this week:

Complete the documentation for scikit models and fix a few issues.


I anticipate quite a few challenges this week. As the model class creation for scikit is dynamic, the documentation in particular is going to be a tricky task. Making it understandable and readable for new users is a priority. I'll try to fill it with examples for better understanding.


Contributing to DFFML, A guide to new contributors

Published: 08/06/2019

I wasn't a Python expert or a machine learning expert when I started. All you need is some patience before contributing to any open source project. (I'll use dffml as a reference.)

First, browse for a project that interests you; this is the most important step. You should understand why you want to contribute: is it something you have been using, does it do something you want to learn, or is it a project assigned to you? Then start with the README, read how to set up the project, go through the guidelines and set it up locally. This can be difficult. If you face problems, ask on whatever channel the organisation uses (IRC, Slack, Gitter and so on) without hesitating. Open source is open, so ask without fear.

Once the project is set up, you should go through the issues. If you are a beginner, look for a 'good first issue' label or find something you can fix in the docs. Fix it according to the guidelines and open a pull request. You might have to wait a long time for a review, so be patient. Make changes if requested and boom, you have made a contribution.

This was a very basic beginner's guide, and I'll make sure to write an advanced contribution guide.


Final Sprint to finish scikit and make it even easier to add models

Published: 08/01/2019

So far I have successfully added 2 models to dffml, out of which dffml-model-scratch (Linear Regression) is already released and available on PyPI. The other, the scikit model, has been merged and is awaiting release.

Doing the scikit model made us realize that we can do it in a better and much faster way. We have now decided on a different procedure for adding scikit classifiers, so that we have minimal repeated code and adding a classifier is as simple as appending it to a dictionary.
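To give a rough idea of what "appending to a dictionary" could look like, here is a minimal sketch. It is not the actual dffml code: the stand-in estimator classes, `ModelBase`, `CLASSIFIERS` and `make_model_classes` are all hypothetical names made up for illustration, and real entries would be scikit-learn classes instead of the stand-ins.

```python
# Hypothetical stand-ins for scikit-learn estimators, so the sketch
# runs without any third-party dependency.
class KNeighborsStandIn:
    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors


class SVCStandIn:
    def __init__(self, C=1.0):
        self.C = C


# Adding a classifier means adding one entry to this dictionary.
CLASSIFIERS = {
    "KNN": KNeighborsStandIn,
    "SVC": SVCStandIn,
}


class ModelBase:
    """Shared wrapper logic lives here once, not once per classifier."""

    ESTIMATOR = None

    def __init__(self, **params):
        # Forward keyword arguments to the wrapped estimator.
        self.clf = self.ESTIMATOR(**params)


def make_model_classes(registry):
    """Create one wrapper class per registry entry using type()."""
    classes = {}
    for name, estimator in registry.items():
        classes[name] = type(
            name + "Model",
            (ModelBase,),
            {
                "ESTIMATOR": estimator,
                "__doc__": f"Wrapper around {estimator.__name__}.",
            },
        )
    return classes


MODELS = make_model_classes(CLASSIFIERS)
knn = MODELS["KNN"](n_neighbors=3)
print(type(knn).__name__, knn.clf.n_neighbors)  # KNNModel 3
```

Because the classes are generated at runtime, their docstrings have to be generated too, which is part of why documenting this kind of code takes extra care.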

We have just planned this and I can't really assess how much time it would take, but I hope to complete it before the final evaluation. The models I would be adding can be found here: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html

Link to my first authored pypi package: https://pypi.org/project/dffml-model-scratch/


Getting started with new models

Published: 07/23/2019

What did I do this week?

This week was mostly about testing and debugging the scikit linear regression model. After that I implemented saving and loading of the model, which took some time to debug. This work was completed by Friday, and I spent the weekend studying some other models, including k-Nearest Neighbors and K-Means, and learning about support vector machines.

Did I get stuck?

Oh, at loads of places. The most surprising and funny one was that I was receiving negative accuracy from the scikit model and I had absolutely no idea what it meant. Now I have got it, and the mentors are looking into how we can make this work for DFFML.
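The post does not say which metric came out negative. One likely explanation, assumed here, is that the score was R², which is what `score()` returns for scikit-learn regressors. R² is not a percentage accuracy: it compares the model against always predicting the mean, so it drops below zero when the model does worse than that baseline. The sketch below computes it by hand; `r_squared` is a hypothetical helper written for this example.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0]

print(r_squared(y_true, [1.0, 2.0, 3.0]))  # 1.0  (perfect fit)
print(r_squared(y_true, [2.0, 2.0, 2.0]))  # 0.0  (same as predicting the mean)
print(r_squared(y_true, [3.0, 2.0, 1.0]))  # -3.0 (worse than the mean)
```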

I got stuck on saving and loading too. scikit offers saving and loading with pickle or joblib, but I also had to save the confidence of the model in a JSON file, which took tons of ideas and debugging.
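A minimal sketch of that idea, using only the standard library: the model goes into a pickle file and the confidence goes into a JSON file next to it. This is not the dffml implementation. The file names, the `save_model` and `load_model` helpers and the dict standing in for a fitted estimator are all assumptions for illustration; a real scikit-learn estimator can be passed to `pickle.dump` in the same way.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_model(model, confidence, directory):
    """Write the model as a pickle and its confidence as JSON."""
    directory = Path(directory)
    with open(directory / "model.pkl", "wb") as f:
        pickle.dump(model, f)
    with open(directory / "confidence.json", "w") as f:
        json.dump({"confidence": confidence}, f)


def load_model(directory):
    """Read back the (model, confidence) pair written by save_model."""
    directory = Path(directory)
    # Only unpickle files you trust: pickle can execute code on load.
    with open(directory / "model.pkl", "rb") as f:
        model = pickle.load(f)
    with open(directory / "confidence.json") as f:
        confidence = json.load(f)["confidence"]
    return model, confidence


# A plain dict stands in for a fitted estimator in this sketch.
stand_in_model = {"coef": [0.5], "intercept": 1.0}

with tempfile.TemporaryDirectory() as tmp:
    save_model(stand_in_model, 0.87, tmp)
    restored, confidence = load_model(tmp)

print(restored == stand_in_model, confidence)  # True 0.87
```

Keeping the confidence in JSON means it can be read without unpickling the model at all.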

Plans for upcoming week?

They'll probably be discussed in the weekly sync, and the previous two models should get merged.
