Articles on seraphimstreets's Bloghttps://blogs.python-gsoc.orgUpdates on different articles published on seraphimstreets's BlogenSun, 04 Sep 2022 23:49:31 +0000Weekly Blog Post #12https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-12-4/<p class="cms-plugin"><span style=""><span style=""><b><u>What did you do this week?</u></b></span></span></p> <p class="cms-plugin"><span style=""><span style="">I met up with my mentor Hashim to discuss the automl branch. He pointed out some mistakes I made, which I have corrected and pushed in the latest commit. </span></span></p> <p class="cms-plugin"><span style=""><span style=""><b><u>What is coming up next?</u></b></span></span></p> <p class="cms-plugin"><span style=""><span style="">Since this is the final week, I hope to meet up with the rest of my mentors to tie up any loose ends and mark the completion of the GSOC project. </span></span></p> <p class="cms-plugin"><span style=""><span style=""><b><u>Did you get stuck anywhere?</u></b></span></span></p> <p style=""><span style=""><span style=""><span style=""><span style=""><span style=""><span style="">Not really. There was slight trouble with the lack of a validation dataset for the tune function, but the current solution (splitting the data using sklearn) should be fine.</span></span></span></span></span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Sun, 04 Sep 2022 23:49:31 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-12-4/Weekly Blog Post #11https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-11-1/<p style=""><span style=""><span style=""><span style=""><span style=""><span style="">Apologies for the late blog post; my school semester was hectic with numerous deadlines. 
</span></span></span></span></span></p> <p style=""><span style=""><span style=""><span style=""><b><u><span style=""><span style="">What did you do this week?</span></span></u></b></span></span></span></p> <p style=""><span style=""><span style=""><span style=""><span style=""><span style="">In the spare time I had, I attempted to create the ensemble model, unfortunately to no avail. Since it’s a stretch goal, I focused on schoolwork instead.</span></span></span></span></span></p> <p style=""><span style=""><span style=""><span style=""><b><u><span style=""><span style="">What is coming up next?</span></span></u></b></span></span></span></p> <p style=""><span style=""><span style=""><span style=""><span style=""><span style="">Since the final evaluation deadline is about a week away, I hope to get the final code review from my mentors and tie up any loose ends in the project.</span></span></span></span></span></p> <p style=""><span style=""><span style=""><span style=""><b><u><span style=""><span style="">Did you get stuck anywhere?</span></span></u></b></span></span></span></p> <p style=""><span style=""><span style=""><span style=""><span style=""><span style="">I was stuck on the ensemble model, which I will probably set aside for now as the deadline approaches. However, I could continue working on it unofficially after the GSOC period. </span></span></span></span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Thu, 01 Sep 2022 13:48:12 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-11-1/Weekly Blog Post #10https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-10-4/<p class="cms-plugin" style=""><span style=""><span style=""><b><u><span style="">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style=""><span style=""><span style="">Finalized the default/user-defined hyperparameters functionality for the AutoML model, added unit tests and a page of documentation. 
</span></span></p> <p class="cms-plugin" style=""><span style=""><span style=""><b><u><span style="">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style=""><span style=""><span style="">I will be adding more unit tests to validate the performance of the model, and will also explore the optional ensemble functionality. I’ve also requested a review from my mentors and will make any changes they request.</span></span></p> <p class="cms-plugin" style=""><span style=""><span style=""><b><u><span style="">Did you get stuck anywhere?</span></u></b></span></span></p> <p class="cms-plugin" style=""><span style=""><span style="">Default hyperparameters are a little troublesome, since they may not work with certain tuners, e.g. Bayesian optimization. I will consult with my mentors to decide how to handle that. </span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Mon, 22 Aug 2022 11:38:30 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-10-4/Weekly Blog Post #9https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-9-1/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I’ve added the ability to pass hyperparameters to the AutoML model, and formulated some default values in case we decide to allow default hyperparameters for the final model. 
Besides that, progress is a little slow as my school semester has started.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I will continue watching closely for any feedback regarding the direction I should take with the AutoML model’s tuning. Furthermore, I will begin adding documentation for the AutoML model.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>Not really; hopefully it stays that way as I reach the final stretch. </span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Sun, 14 Aug 2022 15:23:37 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-9-1/Weekly Blog Post #8https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-8-4/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I completed the first iteration of the AutoML model and pushed it onto a new branch. The current model is able to take user-defined train/test datasets, and iterate over a list of user-defined models to return the best-performing model. 
The tuning functionality has yet to be added, as I am uncertain of the best way to approach tuning and am awaiting my mentor’s feedback. </span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I will be watching closely for any feedback regarding the direction I should take with the AutoML model’s tuning. Otherwise, I will assume that the model should be able to accept a user-defined set of hyperparameters, and work on incorporating that into the existing iteration. </span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>It was certainly challenging to get the AutoML model running, as it is quite different from other existing DFFML models, but I managed it in the end, and the rest should be fairly smooth sailing from here. 
</span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Sun, 07 Aug 2022 22:49:24 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-8-4/Weekly Blog Post #7https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-7-1/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I finished and pushed the changes to the PR that John requested. I’ve also been working on the prototype for the AutoML model, but progress is a little slower this week as I had to handle some administrative matters for my next school semester.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I will keep working on the AutoML prototype; hopefully it will be done by the end of the week.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: 107%;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style="line-height: 107%;"><span style='font-family: "Times New Roman",serif;'>I’m a little unsure about the exact specifications of the AutoML model (how DFFML envisages its usage), and will clarify with my mentors. 
</span></span></span></span></span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Mon, 01 Aug 2022 07:00:37 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-7-1/Weekly Blog Post #6https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-6-5/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I had my PRs reviewed by my mentors John and Saahil this weekend, who largely affirmed my work and requested certain changes to be made. Besides that, I have been working on the first iteration of the AutoML model and planning for the second phase of the project.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I will modify my PRs to adhere to the requests of my mentors. I will also continue working on the AutoML model, and hopefully have a working prototype come end of this or next week. 
</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>Not really, though John pointed out some important details during the review that I had missed, which I will keep in mind for future open-source projects.</span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Mon, 25 Jul 2022 05:42:31 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-6-5/Weekly Blog Post #5https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-5-4/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I finished the documentation pages for the tuning functionality, completing my requirements for the midterm milestone. 
I also received feedback on the tunecli module from my mentor, Hashim, who seemed largely positive about the PR but asked for some minor modifications and changes to a few design decisions.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I’m currently waiting for the final commit on my PR, as well as the other small feature engineering operations PR, to be approved in order to complete the midway milestone. Beyond that, I will begin creating the automl model, incorporating the feature engineering operations and tune functionality.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>Not really. Many of my mentors are ill or otherwise occupied, but I am hopeful that we can make progress on approving the PR soon. 
</span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Sun, 17 Jul 2022 19:47:42 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-5-4/Weekly Blog Post #4https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-4-6/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I added two new tuners to the tunecli PR and two new unit tests, one each for Vowpal Wabbit and TensorFlow. </span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I’ll be creating documentation pages for the tuning functionality and the parameter-grid/bayes_opt_gp tuners.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>Not really, I am just awaiting PR reviews. 
One of my mentors, Hashim, is working on a review for the tunecli PR, which is quite lengthy.</span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Mon, 11 Jul 2022 03:43:16 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-4-6/Weekly Blog Post #3https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-3-3/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>I’ve created unit tests for several DFFML models for the tune CLI, and met with a mentor to discuss the tune CLI PR.  </span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>Since I’ve created sufficient unit tests to meet the midway milestone goals, I will be working on documentation of the new tuning functionality while I await review/approval of my PRs.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: 107%;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style="line-height: 107%;"><span style='font-family: "Times New Roman",serif;'>I’m hoping that my mentors will soon be free to review/accept my PRs, so 
that I may move on to the next step of incorporating additional tuners. </span></span></span></span></span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Sun, 03 Jul 2022 19:14:27 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-3-3/Weekly Blog Post #2https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-2-5/<p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What did you do this week?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>My original PR was too large and disorganized, so I split it up into separate PRs: one for the tune CLI and one for the feature engineering. I hoped to have some of the PRs reviewed, but unfortunately none of my mentors were available this week. Besides that, I also started taking a look at DFFML’s CI tests, which have been failing for a while now, and have found likely solutions for some of them.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">What is coming up next?</span></u></b></span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'>As per the advice of one of my mentors, Hashim, I will be creating unit tests/tutorials for DFFML’s new tuning functionality and pushing them in the next commit. 
I will also continue studying the failing CI tests and hopefully help resolve them.</span></span></p> <p class="cms-plugin" style="margin-bottom: 11px;"><span style="font-size: 12pt;"><span style='font-family: "Times New Roman",serif;'><b><u><span style="font-size: 13.5pt;">Did you get stuck anywhere?</span></u></b></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: 107%;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style="line-height: 107%;"><span style='font-family: "Times New Roman",serif;'>I’ve been having trouble with the local CI testing. I admit I don’t really know much about CI, and I’m not really sure whether the runtime errors I’ve encountered are a result of using Windows Subsystem for Linux, or how to best configure it. I’ll try to do more research to see how these issues can be resolved. </span></span></span></span></span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Mon, 27 Jun 2022 07:09:09 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-2-5/Weekly Blog Post #1 [Jun. 18, 2022]https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-1-jun-18-2022/<p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: normal;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style='font-family: "Times New Roman",serif;'>Hi everyone, my name is Edison Siow and I will be contributing to DFFML under the banner of Python GSOC. My project will be to implement AutoML in the DFFML library, by utilizing a set of hyperparameter tuning/feature engineering techniques. 
</span></span></span></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: normal;"><span style='font-family: "Calibri",sans-serif;'><b><u><span style="font-size: 13.5pt;"><span style='font-family: "Times New Roman",serif;'>What did you do this week?</span></span></u></b></span></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: normal;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style='font-family: "Times New Roman",serif;'>I began work on the first step outlined in my proposal, which was to create the methods and classes needed to implement a CLI ‘tune’ command for a given model and tuner. To begin, I used the existing parameter-grid (grid search) tuner and XGBClassifier model for testing. After closely studying the codebase, I managed to get a working tune command, although I am concerned that it may not conform to the coding practices of DFFML and will be consulting with my mentor this weekend. In the meantime, I moved on to the next step, which was to extend this functionality to more tuners and models. I implemented modules for random search and Bayesian optimization with Gaussian processes, and confirmed their compatibility with a few more DFFML models (XGBRegressor, scikit models, PyTorch models) on a variety of different datasets. 
</span></span></span></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: normal;"><span style='font-family: "Calibri",sans-serif;'><b><u><span style="font-size: 13.5pt;"><span style='font-family: "Times New Roman",serif;'>What is coming up next?</span></span></u></b></span></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: normal;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style='font-family: "Times New Roman",serif;'>After consulting with my mentors this weekend, I hope to iron out any inconsistencies and anti-patterns in my code in preparation for a proper commit.  Following that, I hope to test my existing pool of tuners on a wider range of DFFML models.</span></span></span></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: normal;"><span style='font-family: "Calibri",sans-serif;'><b><u><span style="font-size: 13.5pt;"><span style='font-family: "Times New Roman",serif;'>Did you get stuck anywhere?</span></span></u></b></span></span></span></p> <p style="margin-bottom: 11px;"><span style="font-size: 11pt;"><span style="line-height: 107%;"><span style='font-family: "Calibri",sans-serif;'><span style="font-size: 12.0pt;"><span style="line-height: 107%;"><span style='font-family: "Times New Roman",serif;'>I faced difficulties at various points of the aforementioned process, but managed to work through them eventually. At this point, I am more concerned that my code may contain antipatterns due to my lack of experience. </span></span></span></span></span></span></p>edisonsiowxiong@gmail.com (seraphimstreets)Fri, 17 Jun 2022 19:20:03 +0000https://blogs.python-gsoc.org/en/seraphimstreetss-blog/weekly-blog-post-1-jun-18-2022/
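<p style="margin-bottom: 11px;">For readers unfamiliar with the tuners mentioned in post #1, here is a minimal, library-independent sketch of the difference between grid search and random search over a small hyperparameter space. It does not use DFFML's tuner classes or its CLI; the function names, the toy scoring function, and the hyperparameter values below are all hypothetical stand-ins chosen purely for illustration.</p>

```python
import itertools
import random


def toy_score(params):
    """Hypothetical stand-in for 'train a model, return validation accuracy'."""
    # The score peaks at learning_rate=0.1 and max_depth=4 (illustrative values).
    return 1.0 - abs(params["learning_rate"] - 0.1) - 0.05 * abs(params["max_depth"] - 4)


def grid_search(space, score):
    """Evaluate every combination in the space and keep the best one."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def random_search(space, score, trials, seed=0):
    """Evaluate a fixed number of randomly sampled combinations."""
    rng = random.Random(seed)  # seeded so repeated runs sample the same points
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical search space: 3 x 3 = 9 combinations for grid search.
SPACE = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [2, 4, 6]}

if __name__ == "__main__":
    print("grid:  ", grid_search(SPACE, toy_score))
    print("random:", random_search(SPACE, toy_score, trials=5))
```

<p style="margin-bottom: 11px;">Grid search always finds the best point in the grid, at the cost of evaluating every combination. Random search caps the number of evaluations, so it may miss that point but scales better as the space grows.</p>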