Articles on yashlamba's Bloghttps://blogs.python-gsoc.orgUpdates on different articles published on yashlamba's BlogenMon, 26 Aug 2019 15:58:47 +0000Final work submission and future workhttps://blogs.python-gsoc.org/en/yashlambas-blog/final-work-submission-and-future-work/<p>As the final week ended, we had to submit a compilation of our work during GSoC. Below are some insights:</p> <p><strong>What was the original aim?</strong></p> <p>Adding new machine learning models to DFFML. The proposed models are given below:</p> <ol> <li>Model 1: Ordinary Least Squares Regression (OLSR)</li> <li>Model 2: Logistic Regression</li> <li>Model 3: k-Nearest Neighbour (kNN)</li> <li>Model 4: Naive Bayes</li> </ol> <p><strong>Modifications decided during community bonding:</strong></p> <p>During the community bonding period, the proposed work was modified to get the best possible result from the summer. The finalized work was:</p> <ol> <li>Adding Linear Regression Model from scratch</li> <li>Adding Linear Regression and other proposed models using scikit-learn</li> <li>Adding tests for the added models</li> <li>Documenting the models</li> </ol> <p><strong>Tasks Completed:</strong></p> <ul> <li> <p>Added Linear Regression model from scratch with tests</p> <p>A simple Linear Regression model implemented from scratch. This was successfully completed with tests and documentation, and was also released on PyPI.</p> </li> <li> <p>Added scikit models with dynamic support</p> <p>Initially, the plan was to add a certain number of models from scikit, but after I did it with one model (Multiple Linear Regression with scikit), we decided to extend this: build a base for all scikit models and create the other model classes dynamically. This was successful, and now adding a scikit model to DFFML is as easy as appending the model name to a Python dictionary. 
The tests are complete and the documentation material is ready, but we are still figuring out a more understandable way of documenting this before release.</p> </li> </ul> <p><strong>Future Work:</strong></p> <p>The project was started just before GSoC'19 and it has come a long way since. I plan on contributing significantly to the project after GSoC'19. A few of the planned items:</p> <ol> <li>Adding more scikit models</li> <li>Working with more machine learning libraries and adding models</li> <li>Constructing the DFFML Web UI from scratch, which was conceptualized during the summer, and much more.</li> </ol> <p> </p> <p>More detailed report: https://gist.github.com/yashlamba/5e0845a6cd5a1198f166ddedfba78802</p> <p> </p>yashlamba2000@gmail.com (yashlamba)Mon, 26 Aug 2019 15:58:47 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/final-work-submission-and-future-work/Final week check-inhttps://blogs.python-gsoc.org/en/yashlambas-blog/final-week-check-in/<p><strong>Targets for this week:</strong></p> <p>Almost all my targets and extended goals were achieved before 19th August. This week I have been working on documentation for the scikit models in dffml. We want to make the package's docs as clear as possible before release.</p> <p><strong>About the final evaluation:</strong></p> <p>I haven't yet submitted the final evaluation at the time of writing this blog. I am waiting for the documentation to get finished somehow in the next couple of days so that I can add it to the work chart too. I think I have done my work well, and I have bonded with the project so much that I think I will continue contributing to it for a long time.</p> <p><strong>Challenges: </strong></p> <p>The challenge right now is finding the best way to document the scikit models, which we have been brute-forcing in recent discussions. 
I'll post a couple more final blogs soon about contributing to larger projects as a beginner, and probably one more about how to document your code.</p>yashlamba2000@gmail.com (yashlamba)Thu, 22 Aug 2019 16:45:09 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/final-week-check-in/Final Coding Week - Covering whatever is lefthttps://blogs.python-gsoc.org/en/yashlambas-blog/final-coding-week-covering-whatever-is-left/<p>Completed Work:</p> <p>The majority of what I had proposed is complete now. I have successfully added the following models to dffml:</p> <ol> <li style="margin-left: 40px;">Linear Regression from Scratch (Released with complete documentation)</li> <li style="margin-left: 40px;">Scikit Models (Merged and waiting for documentation): <ul> <li style="margin-left: 40px;">https://github.com/intel/dffml/blob/3d04591dc664fcde1b9a95650b40fb76b6569abf/model/scikit/dffml_model_scikit/scikit_models.py#L46-L87</li> </ul> </li> </ol> <p>Targets for this week:</p> <p>Complete the documentation for scikit models and fix a few issues.</p> <p>Challenges:</p> <p>I anticipate quite a few challenges this week. As the model class creation for scikit is dynamic, the documentation in particular is going to be a tricky task. Making it understandable and readable for new users is a priority. I'll try to fill it with examples for better understanding.</p>yashlamba2000@gmail.com (yashlamba)Wed, 14 Aug 2019 04:44:51 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/final-coding-week-covering-whatever-is-left/Contributing to DFFML, A guide to new contributorshttps://blogs.python-gsoc.org/en/yashlambas-blog/contributing-to-dffml-a-guide-to-new-contributors/<p>I wasn't a Python expert or a machine learning expert myself when I started; all you need is some patience before contributing to any open source project. (I'll use dffml as a reference.)</p> <p>First, find a project that interests you; this is the most important step. 
You should understand why you want to contribute: is it something you have been using, does it have something you want to learn to do, is it a project assigned to you, or something similar? Then start with the README: read how to set up the project, go through the guidelines and set it up locally. This can be difficult. If you face problems, ask without hesitating on whichever channel the organisation uses (IRC, Slack, Gitter and so on). Open source is open, so ask without fear.</p> <p>Once set up, you should go through the issues. If you are a beginner, there might be a label 'good first issue', or you can find something to fix in the docs. Fix it according to the guidelines and open a pull request. You might have to wait a long time for a review, so be patient. Make changes if requested and boom, you have made a contribution.</p> <p>This was a very beginner-level guide, and I'll make sure to write an advanced contribution guide.</p>yashlamba2000@gmail.com (yashlamba)Tue, 06 Aug 2019 16:17:19 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/contributing-to-dffml-a-guide-to-new-contributors/Final Sprint to finish scikit and make it even easier to add modelshttps://blogs.python-gsoc.org/en/yashlambas-blog/final-sprint-to-finish-scikit-and-make-it-even-easier-to-add-models/<p>So far I have successfully added 2 models to dffml, out of which dffml-model-scratch (Linear Regression) is already released and available on PyPI. The other, a scikit model, has been merged and is awaiting release.</p> <p>Doing the scikit model made us realize that we can do it in a better and much faster way. We have now decided on a different procedure for adding scikit classifiers, such that we have minimal repeated code and adding a classifier is as simple as appending it to a dictionary.</p> <p>We have only just planned this and I can't really assess how much time it will take, but I hope to complete it before the final evaluation. 
The models I will be adding can be found here: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html</p> <p>Link to the first PyPI package I have authored: https://pypi.org/project/dffml-model-scratch/</p>yashlamba2000@gmail.com (yashlamba)Thu, 01 Aug 2019 20:14:28 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/final-sprint-to-finish-scikit-and-make-it-even-easier-to-add-models/Getting started with new modelshttps://blogs.python-gsoc.org/en/yashlambas-blog/getting-started-with-new-models/<p><em><strong>What did I do this week?</strong></em></p> <p>This week was mostly about testing and debugging the scikit linear regression model. After that I implemented saving and loading of the model, which itself took some time to debug. This work was completed by Friday, and I spent the weekend studying some other models, including k-Nearest Neighbors and K Means, and learnt about support vector machines.</p> <p><em><strong>Did I get stuck?</strong></em></p> <p>Oh, at loads of places. The most surprising and funny one was that I was getting negative accuracy out of the scikit model and had absolutely no idea what it meant. Now I understand it, and my mentors are looking into how we can make this work for DFFML.</p> <p>I got stuck on saving and loading too. scikit offers saving and loading with pickle or joblib, but I also had to save the confidence of the model in a JSON file, which took tons of ideas and debugging.</p> <p><em><strong>Plans for the upcoming week?</strong></em></p> <p>They'll probably be discussed in the weekly sync, and the previous two models will be merged.</p>yashlamba2000@gmail.com (yashlamba)Tue, 23 Jul 2019 03:33:55 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/getting-started-with-new-models/Finishing up Scikit Linear Regressionhttps://blogs.python-gsoc.org/en/yashlambas-blog/finishing-up-scikit-linear-regression/<p>My assigned task was to make the requested changes and refactor the scratch linear regression model. 
The tests and documentation are complete and are ready to merge along with the model. From now on I will be working mostly on scikit models.<br> <br> Targets:</p> <p>1. Linear Regression<br> 2. k Nearest neighbours<br> <br> Challenges faced:<br> The challenges are mostly in making the code more understandable and neat. I have been following a protocol of implementing the algorithm in an external repo and then, after discussing it with my mentor, wrapping it as neatly as possible.<br> Another challenge is debugging. There is no set way to check or debug the code, so I have to write tests at the same time and make sure they are meaningful. This is one of the reasons why the scratch Linear Regression took so much time.</p> <p>Resources I am using:</p> <p>I really wanted to mention this and will probably write a separate blog on it, but for now I am mainly following the scikit documentation and an amazing Python channel named 'sentdex' on YouTube.</p>yashlamba2000@gmail.com (yashlamba)Wed, 17 Jul 2019 16:26:45 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/finishing-up-scikit-linear-regression/Coding and Communicationhttps://blogs.python-gsoc.org/en/yashlambas-blog/coding-and-communication/<p>You have to constantly be learning and implementing various things if you want to be better at programming. Watch some tutorials, check out some blogs, read the documentation and start experimenting. What I have learnt from the past year's experience is that if you want to move ahead and get better at any programming stack, don't get lost in the plethora of tutorials available: pick a project, experiment, un-learn and re-learn things, and implement. This is by far the most valuable lesson I have learnt.</p> <p>The second most valuable key is communication. Don't be shy. Ask questions, because the most valuable teacher is experience, and people who have it are the best source of knowledge you can ask for. 
My mentor has taught me a lot of things since I started GSoC, and I have even learnt quite a few things from the other student working on the project. These lessons and connections will definitely go a long way in my programming career.</p>yashlamba2000@gmail.com (yashlamba)Thu, 11 Jul 2019 17:15:56 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/coding-and-communication/Working with SciKit and meeting deadlineshttps://blogs.python-gsoc.org/en/yashlambas-blog/working-with-scikit-and-meeting-deadlines/<p>The first linear regression model took unexpectedly long, which put me in a difficult place because now I have to cover at least a couple more models before the second evaluation. My mentor and I decided to take up scikit implementations now, and I think they will be faster to implement and will give better results too.</p> <p>Task Assigned: Complete Linear Regression and one more model (working on both KNN and Logistic Regression) before 15th July.</p> <p>Current State: I ran into unexpected problems with Payoneer and hadn't received the payment for my first evaluation up until now. This took a significant part of my time, but task-wise I completed the context work of Linear Regression, which leaves me with testing and documentation only. For the other model, I am done studying and practising its raw implementation, and I think I am ready to start once I am done with my Linear Regression models.</p> <p>Last week was unexpectedly slow and I will probably have to catch up.</p>yashlamba2000@gmail.com (yashlamba)Tue, 09 Jul 2019 00:44:01 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/working-with-scikit-and-meeting-deadlines/Week 3, Wrapping up Linear Regression from scratch!https://blogs.python-gsoc.org/en/yashlambas-blog/week-3-wrapping-up-linear-regression-from-scratch/<p><strong>What was my task?</strong></p> <p>This week I had to finalize the Linear Regression model in DFFML. 
My mentor and I had discussed and agreed on giving the linear regression from scratch model a little more time, as it could help us make a new model tutorial later. I have been working on it for the past 3 weeks, which covered code, documentation, testing and implementing saving and loading besides the main algorithm.</p> <p><strong>Did I complete my task?</strong></p> <p>Yes. Three days into the 3rd week I was done with the model and it was working fine, until my mentor felt that making models would be a lot easier if he made some changes to the config of DFFML. So to merge my PR, I am waiting for those changes and simultaneously working on the next task, which I will talk about in the next blog.</p> <p><strong>Challenges!</strong></p> <p>This week was not very challenging, as I mostly had to make the requested changes, which my mentor explained to me very well. As DFFML is new, changes sometimes come up in the core code base that are challenging for both my mentor and the two students involved in GSoC. But that is how projects work: we have a healthy discussion every week and solve every problem together.</p>yashlamba2000@gmail.com (yashlamba)Sun, 23 Jun 2019 18:28:27 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/week-3-wrapping-up-linear-regression-from-scratch/Week 2 Wrap up!https://blogs.python-gsoc.org/en/yashlambas-blog/week-2-wrap-up/<p><strong>What was my assigned task?</strong></p> <p>As I was working on adding the linear regression model to dffml, my subtask for week 2 was to complete the implementation of the model so that I can work on saving and loading as well as testing in the following week.</p> <p><strong>Did I complete it? Was it according to my proposal?</strong></p> <p>I did complete my assigned task, but it wasn't according to my proposal. 
According to my proposal I had to complete OLSR by week two, but after a discussion with my mentor about how we need a fresh "new model tutorial" in the docs so that new contributors can understand the whole procedure, I implemented OLSR from scratch and will have to work on extensive documentation.</p>yashlamba2000@gmail.com (yashlamba)Thu, 13 Jun 2019 15:23:52 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/week-2-wrap-up/Read, Code, StackOverflow, Commit, Repeathttps://blogs.python-gsoc.org/en/yashlambas-blog/read-code-stackoverflow-commit-repeat/<p>Being assigned a task that sometimes feels tough really pushes you to learn more. I have been assigned such a task and believe me, I feel lost every few minutes. Working with newer things like context management and machine learning models, I spend most of my time learning new things rather than coding, and it's been amazing. I have learnt more this summer than I could have imagined. I guess week 2 is going to be filled with more coding than learning.</p> <p>One more thing that I felt was awesome is having conversations with people. I have had amazing conversations with my mentor as well as my fellow students, and all of them are so helpful and understanding. GSoC has so far been an amazing experience in every way.</p>yashlamba2000@gmail.com (yashlamba)Mon, 03 Jun 2019 08:10:22 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/read-code-stackoverflow-commit-repeat/Getting Started with GSoC'19 Project!https://blogs.python-gsoc.org/en/yashlambas-blog/getting-started-with-gsoc-19-project/<p>As the community bonding period ended recently, I started working on my selected project for GSoC, but who knew that working on a project outside your comfort zone would be this demanding.</p> <p>I was intrigued by the idea of <strong>DFFML</strong>. 
Before GSoC, I worked on sources (a part of DFFML that deals with data sources), and that work was completed with the help of my mentor and another fellow student. It required knowledge of basic Python and testing. So for GSoC'19, I had to think of a new project. In a meeting, my mentor suggested that I work on adding machine learning models to DFFML, but machine learning was way outside my knowledge. Then I stumbled upon an article about how GSoC helps you learn new things rather than implement what you are already well versed in. So I prepared a proposal and refined it with my mentor.</p> <p>After getting selected, I knew I had a lot of work to do. I started with <strong>async functions</strong> and refined my Python knowledge, then my college exams started, which led to a little break. But during the exams I had a meeting with my mentor to decide the checkpoints, and he was super helpful and supportive of the project. Now that the exams have ended (the current period), I am working with libraries like <strong>numpy, pandas</strong> and have a goal of implementing <strong>linear regression</strong> from scratch.</p> <p> </p>yashlamba2000@gmail.com (yashlamba)Wed, 29 May 2019 07:08:12 +0000https://blogs.python-gsoc.org/en/yashlambas-blog/getting-started-with-gsoc-19-project/