As the final week ended, we had to submit a compilation of our work during GSoC. Below are some insights:
What was the original aim?
The aim was to add new machine learning models to DFFML. The proposed models were:
- Model 1: Ordinary Least Squares Regression (OLSR)
- Model 2: Logistic Regression
- Model 3: k-Nearest Neighbour (kNN)
- Model 4: Naive Bayes
Modifications decided during community bonding:
During the community bonding period, the proposed work was modified to make the best use of the summer. The finalized work was:
- Adding Linear Regression Model from scratch
- Adding Linear Regression and other proposed models using scikit-learn
- Adding tests for the added models
- Documenting the models
Tasks Completed:
- Added Linear Regression model from scratch with tests
  A Simple Linear Regression model was implemented from scratch. It was completed with tests and documentation, and was also released on PyPI.
- Added scikit models with dynamic support
  Initially, the plan was to add a fixed number of models from scikit. After implementing the first one (Multiple Linear Regression with scikit), we decided to extend the approach: build a common base for all scikit models and generate the other model classes dynamically. This worked, and adding a scikit model to DFFML is now as easy as adding the model name to a Python dictionary. The tests are complete and the documentation material is ready, but we are still working out a clearer way of documenting this before release.
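To give a rough idea of the dictionary-driven approach, here is a minimal sketch. It is not DFFML's actual code: the names `EstimatorModel`, `ESTIMATORS` and `MODELS` are hypothetical, and the estimator is a small from-scratch simple linear regression standing in for a scikit estimator (it only mimics the `fit`/`predict` convention), so the example runs without scikit installed.

```python
# Hypothetical sketch: none of these names are DFFML's actual API.


class SimpleLinearRegression:
    """From-scratch least-squares fit of y = slope * x + intercept.

    Stand-in for a scikit estimator: it mimics the fit/predict
    convention so the sketch runs without scikit installed.
    """

    def fit(self, X, y):
        xs = [row[0] for row in X]  # one feature per sample
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(y) / n
        sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y))
        sxx = sum((x - mean_x) ** 2 for x in xs)
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, X):
        return [self.slope * row[0] + self.intercept for row in X]


class EstimatorModel:
    """Shared base: the train/predict logic is written once, here."""

    ESTIMATOR = None  # filled in by the generated subclasses

    def __init__(self, **params):
        self.estimator = self.ESTIMATOR(**params)

    def train(self, X, y):
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        return self.estimator.predict(X)


# Supporting another model means adding one entry here. With scikit
# installed, an entry could be e.g. sklearn.linear_model.LinearRegression.
ESTIMATORS = {
    "SimpleLinearRegression": SimpleLinearRegression,
}

# Build one wrapper class per dictionary entry.
MODELS = {
    name: type(name + "Model", (EstimatorModel,), {"ESTIMATOR": estimator})
    for name, estimator in ESTIMATORS.items()
}

model = MODELS["SimpleLinearRegression"]().train([[0], [1], [2]], [1, 3, 5])
predictions = model.predict([[3]])
```

The point of the design is that the wrapper classes are generated with `type()` from the dictionary, so none of the training or prediction code has to be repeated per model.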
Future Work:
The project was started just before GSoC'19 and it has come a long way since. I plan on contributing significantly to the project after GSoC'19. A few of the planned items:
- Adding more scikit models
- Working with more machine learning libraries and adding their models
- Constructing the DFFML Web UI from scratch, which was conceptualized during the summer, and much more.
More detailed report: https://gist.github.com/yashlamba/5e0845a6cd5a1198f166ddedfba78802