Hi everyone, my name is Edison Siow and I will be contributing to DFFML under the banner of Python GSOC. My project will be to implement AutoML in the DFFML library, by utilizing a set of hyperparameter tuning/feature engineering techniques.
What did you do this week?
I began work on the first step outlined in my proposal: creating the methods and classes needed to implement a CLI 'tune' command for a given model and tuner. To begin, I used the existing parameter-grid (grid search) tuner and the XGBClassifier model for testing. After closely studying the codebase, I managed to get a working tune command, although I am concerned that it may not conform to DFFML's coding practices, so I will be consulting with my mentor this weekend. In the meantime, I moved on to the next step: extending this functionality to more tuners and models. I implemented modules for random search and Bayesian optimization with Gaussian processes, and confirmed their compatibility with several more DFFML models (XGBRegressor, scikit models, PyTorch models) across a variety of datasets.
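To give a flavor of what a tuner like this does, here is a minimal random-search sketch. This is not DFFML's actual tuner API (which I am still aligning my code with); the objective function, hyperparameter names, and search-space ranges below are all made up for illustration.

```python
import random

# Hypothetical stand-in for "train a model with these hyperparameters
# and return its validation score". A real tuner would call the model's
# train/accuracy methods here instead.
def evaluate(params):
    # Toy objective that peaks near learning_rate=0.1, max_depth=5.
    return 1.0 - abs(params["learning_rate"] - 0.1) \
               - 0.01 * abs(params["max_depth"] - 5)

# Each hyperparameter maps to a sampler over its range (made-up ranges).
search_space = {
    "learning_rate": lambda: random.uniform(0.01, 0.3),
    "max_depth": lambda: random.randint(2, 10),
}

def random_search(space, trials=50, seed=0):
    """Sample the space `trials` times, keep the best-scoring config."""
    random.seed(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        candidate = {name: sample() for name, sample in space.items()}
        score = evaluate(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score

best, score = random_search(search_space)
print(best, score)
```

Grid search differs only in that it enumerates every combination from fixed lists of values instead of sampling; Bayesian optimization replaces the random sampling with a Gaussian-process model that proposes promising candidates based on previous trials.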
What is coming up next?
After consulting with my mentors this weekend, I hope to iron out any inconsistencies and anti-patterns in my code in preparation for a proper commit. Following that, I hope to test my existing pool of tuners on a wider range of DFFML models.
Did you get stuck anywhere?
I faced difficulties at various points in the process described above, but managed to work through them eventually. At this point, my main concern is that my code may contain anti-patterns due to my lack of experience.