yashlamba's Blog

Final work submission and future work

yashlamba
Published: 08/26/2019

As the final week ended, we had to submit a compilation of our work during GSoC. Below are some insights:

What was the original aim?

The original aim was to add new machine learning models to DFFML. The proposed models were:

  1. Model 1: Ordinary Least Squares Regression (OLSR)
  2. Model 2: Logistic Regression
  3. Model 3: k-Nearest Neighbour (kNN)
  4. Model 4: Naive Bayes

Decided modifications during community bonding:

During the community bonding period, the proposed work was modified to get the most out of the summer. The finalized work was:

  1. Adding Linear Regression Model from scratch
  2. Adding Linear Regression and other proposed models using scikit-learn
  3. Adding tests for the added models
  4. Documenting the models

Tasks Completed:

  • Added Linear Regression model from scratch with tests

    A simple Linear Regression model implemented from scratch. It was completed with tests and documentation, and was also released on PyPI.

  • Added scikit models with dynamic support

    Initially, the plan was to add a fixed number of scikit models, but after adding the first one (Multiple Linear Regression with scikit), we decided to extend the idea: build a common base for all scikit models and create the other model classes dynamically. This worked, and adding a scikit model to DFFML is now as easy as adding the model name to a Python dictionary. The tests are complete and the documentation material is ready, but we are still figuring out a more understandable way of documenting this before release.
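The "common base plus dictionary" idea can be sketched roughly as follows. This is only an illustration, not the code in dffml_model_scikit: `BaseScikitModel`, `MeanRegressor`, `MajorityClassifier`, `ESTIMATORS` and `MODELS` are hypothetical names, and the two small estimators stand in for real scikit-learn classes so that the example runs without any dependencies.

```python
class BaseScikitModel:
    """Hypothetical shared base: the train/predict logic is written once."""

    ESTIMATOR = None  # each generated subclass sets this to an estimator class

    def __init__(self, **params):
        self.estimator = self.ESTIMATOR(**params)

    def train(self, features, targets):
        self.estimator.fit(features, targets)
        return self

    def predict(self, features):
        return self.estimator.predict(features)


class MeanRegressor:
    """Stand-in estimator with a scikit-style fit/predict interface."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class MajorityClassifier:
    """Stand-in estimator that always predicts the most common label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


# Supporting a new model means adding one entry to this dictionary.
ESTIMATORS = {
    "MeanRegressor": MeanRegressor,
    "MajorityClassifier": MajorityClassifier,
}

# Build one model class per entry with type(name, bases, namespace).
MODELS = {
    name: type(name + "Model", (BaseScikitModel,), {"ESTIMATOR": estimator})
    for name, estimator in ESTIMATORS.items()
}

model = MODELS["MeanRegressor"]().train([[1], [2], [3]], [2.0, 4.0, 6.0])
prediction = model.predict([[10]])
```

In a real setup the dictionary values would be scikit-learn estimator classes instead of the stand-ins, and the base class would carry whatever the framework needs (configuration, saving, loading).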

Future Work:

The project was started just before GSoC'19 and it has come a long way since. I plan to keep contributing significantly to the project after GSoC'19. A few of the things planned:

  1. Adding more scikit models
  2. Working with more machine learning libraries and adding their models
  3. Constructing the DFFML Web UI from scratch, which was conceptualized during the summer, and much more.

 

More detailed report: https://gist.github.com/yashlamba/5e0845a6cd5a1198f166ddedfba78802

 


Final week check-in

yashlamba
Published: 08/22/2019

Targets for this week:

Almost all my targets and extended goals were achieved before 19th August. This week I have been working on documentation for the scikit models in dffml. We want to make the package's docs as clear as possible before release.

About the final evaluation:

I haven't yet submitted the final evaluation at the time of writing this blog. I am waiting for the documentation to be finished in the next couple of days so that I can add it to the work chart too. I think I have done my work well, and I have bonded with the project so much that I think I will continue contributing to it for a long time.

Challenges:

The challenge right now is finding the best way to document the scikit models, which we have been brute forcing in recent discussions. I'll post a couple more final blogs soon about contributing to larger projects as a beginner, and probably one more about how to document your code.


Final Coding Week - Covering whatever is left

yashlamba
Published: 08/14/2019

Completed Work:

The majority of what I had proposed is complete now. I have successfully added the following models to dffml:

  1. Linear Regression from Scratch (Released with complete documentation)
  2. Scikit Models (Merged and waiting for documentation):
    • https://github.com/intel/dffml/blob/3d04591dc664fcde1b9a95650b40fb76b6569abf/model/scikit/dffml_model_scikit/scikit_models.py#L46-L87

Targets for this week:

Complete the documentation for scikit models and fix a few issues.

Challenges:

I anticipate quite a few challenges this week. As the model class creation for scikit is dynamic, the documentation in particular is going to be a tricky task. Making it understandable and readable for new users is a priority. I'll try to fill it with examples for better understanding.
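To show why dynamically created classes are awkward to document, and one possible way around it, here is a small sketch. It is an assumption for illustration only, not how the dffml docs are generated: `BaseModel`, `DESCRIPTIONS` and `make_documented_class` are hypothetical names. The idea is to attach a docstring when the class is built, so that `help()` and documentation tools have something to show.

```python
import inspect


class BaseModel:
    """Hypothetical base class shared by all generated models."""


# Hypothetical one-line descriptions, keyed by generated class name.
DESCRIPTIONS = {
    "KNeighborsModel": "Classifies a record by a vote of its nearest neighbours.",
    "GaussianNBModel": "Naive Bayes classifier that assumes Gaussian features.",
}


def make_documented_class(name, description):
    """Create a subclass of BaseModel with a docstring set at creation time."""
    doc = f"{description}\n\nGenerated dynamically as a subclass of {BaseModel.__name__}."
    return type(name, (BaseModel,), {"__doc__": doc})


MODELS = {
    name: make_documented_class(name, text)
    for name, text in DESCRIPTIONS.items()
}

# inspect.getdoc returns the cleaned-up docstring, as help() would show it.
first_line = inspect.getdoc(MODELS["KNeighborsModel"]).splitlines()[0]
```

A class that exists only at runtime has no source-level docstring, so anything a reader should see has to be supplied in a step like this or written by hand in the docs.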


Contributing to DFFML, A guide to new contributors

yashlamba
Published: 08/06/2019

I wasn't a Python expert or a machine learning expert myself when I started. All you need is some patience before contributing to any open source project. (I'll use dffml as a reference.)

First, browse for a project that interests you; this is the most important step. You should understand why you want to contribute: is it something you have been using, does it do something you want to learn, is it a project assigned to you, or something similar? Then start with the README: read how to set up the project, go through the guidelines, and set it up locally. This can be difficult. If you face problems, ask on the project's channel, whether that is IRC, Slack, Gitter, or whatever the organisation uses, without hesitating. Open source is open, so ask without fear.

Once it is set up, go through the issues. If you are a beginner, look for a label such as 'good first issue', or find something you can fix in the docs. Fix it according to the guidelines and open a pull request. You may have to wait a long time for a review, so be patient. Make changes if requested and, boom, you have made a contribution.

This was a very basic beginner's guide, and I'll make sure to write an advanced contribution guide.


Final Sprint to finish scikit and make it even easier to add models

yashlamba
Published: 08/01/2019

So far I have successfully added 2 models to dffml, out of which dffml-model-scratch (Linear Regression) is already released and available on PyPI. The other, the scikit model, has been merged and is awaiting release.
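For readers wondering what "Linear Regression from scratch" involves, here is a minimal sketch of simple linear regression using the closed-form least squares solution. It is not the dffml-model-scratch code; the function names are made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of (x - mean_x)(y - mean_y) over sum of (x - mean_x)^2
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Points that lie exactly on y = 2x + 1.
slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
estimate = predict(10, slope, intercept)
```

The sketch assumes at least two distinct x values; with identical x values the denominator would be zero.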

Working on the scikit model made us realize that we could do it in a better and much faster way. We have now decided on a different procedure for adding scikit classifiers, so that repeated code is kept to a minimum and adding a classifier is as simple as appending it to a dictionary.

We have just planned this and I can't really assess how much time it would take, but I hope to complete it before the final evaluation. The models I would be adding can be found here: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html

Link to my first authored PyPI package: https://pypi.org/project/dffml-model-scratch/
