seraphimstreets's Blog

Weekly Blog Post #2

seraphimstreets
Published: 06/27/2022

What did you do this week?

My original PR was too large and disorganized, so I split it up into several smaller PRs: one for the tune CLI and one for the feature engineering. I had hoped to have some of them reviewed, but unfortunately none of my mentors were available this week. Besides that, I also started looking into DFFML’s CI tests, which have been failing for a while now, and have found likely solutions for some of them.

What is coming up next?

As per the advice of one of my mentors, Hashim, I will be creating unit tests and tutorials for DFFML’s new tuning functionality and pushing them in my next commit. I will also continue studying the failing CI tests and hopefully help resolve them.

Did you get stuck anywhere?

I’ve been having trouble with local CI testing. I admit I don’t know much about CI, and I’m not sure whether the runtime errors I’ve encountered are a result of using Windows Subsystem for Linux, or how best to configure it. I’ll do more research to see how these issues can be resolved.


Weekly Blog Post #1 [Jun. 18, 2022]

seraphimstreets
Published: 06/17/2022

Hi everyone, my name is Edison Siow and I will be contributing to DFFML under the banner of Python GSoC. My project is to implement AutoML in the DFFML library by utilizing a set of hyperparameter tuning and feature engineering techniques.

What did you do this week?

I began work on the first step outlined in my proposal, which was to create the requisite methods and classes needed to implement a CLI ‘tune’ command for a given model and tuner. To begin, I used the existing parameter-grid (grid search) tuner and the XGBClassifier model for testing. After closely studying the codebase, I managed to get a working tune command, although I am concerned that it may not conform to DFFML’s coding practices and will be consulting with my mentor this weekend. In the meantime, I moved on to the next step, which was to extend this functionality to more tuners and models. I implemented modules for random search and Bayesian optimization with Gaussian processes, and confirmed their compatibility with a few more DFFML models (XGBRegressor, scikit-learn models, PyTorch models) on a variety of datasets.
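For readers unfamiliar with parameter-grid tuning, the core idea is just an exhaustive sweep over the cross product of candidate hyperparameter values, keeping the best-scoring configuration. Here is a minimal, generic sketch of that loop; this is not DFFML’s actual API, and `train_and_score` is a hypothetical callback standing in for whatever trains a model and returns its validation score:

```python
from itertools import product

def parameter_grid_search(train_and_score, param_grid):
    """Evaluate every combination in param_grid and return the
    best (score, config) pair. Assumes higher scores are better."""
    names = list(param_grid)
    best_score, best_config = float("-inf"), None
    for values in product(*(param_grid[n] for n in names)):
        config = dict(zip(names, values))
        score = train_and_score(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config

# Toy objective: pretend accuracy peaks at lr=0.1, max_depth=3.
def toy_score(cfg):
    return -abs(cfg["learning_rate"] - 0.1) - abs(cfg["max_depth"] - 3)

best, cfg = parameter_grid_search(
    toy_score,
    {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 3, 4]},
)
# cfg → {'learning_rate': 0.1, 'max_depth': 3}
```

Random search differs only in how configurations are drawn (sampling from the grid or from distributions instead of enumerating it), while Bayesian optimization replaces the blind loop with a surrogate model that proposes the next configuration to try.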

What is coming up next?

After consulting with my mentors this weekend, I hope to iron out any inconsistencies and anti-patterns in my code in preparation for a proper commit. Following that, I hope to test my existing pool of tuners on a wider range of DFFML models.

Did you get stuck anywhere?

I faced difficulties at various points of the process described above, but managed to work through them eventually. At this point, I am more concerned that my code may contain anti-patterns due to my lack of experience.
