What did you do this week?
This week I wrote up my final project report, had it reviewed by my mentors, and implemented the changes they suggested.
What is coming up next?
I will keep contributing to the codebase.
Did you get stuck anywhere?
No
This week I worked on a streaming source, dfpreprocess (which will be renamed to df), with the help of my mentor John. I also worked on some example datasets that we can use to document how the cleanup operations can be applied for data cleaning within a dataflow.
Next, I will be writing documentation and getting my pull request reviewed.
Yes, I was stuck on the streaming source, but with the help of my mentor John we were able to solve the issues.
This week I looked into how we can use the input layer to get the data for all records, convert it into a matrix, and do further processing on top of it.
I will work on dataset examples and document how the cleanup operations can be used for data cleaning, and how they fit in with the training and testing steps.
I was stuck on one of the operations, which was throwing an "operation not instantiable" error. I looked into the code to try to figure out the reason behind it.
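To make the records-to-matrix idea from this week concrete, here is a minimal sketch in plain Python. It assumes each record is a dict of numeric feature values. The names `records_to_matrix`, `height` and `weight` are made up for illustration and are not part of the project's API.

```python
from typing import Dict, List, Sequence


def records_to_matrix(
    records: Sequence[Dict[str, float]], features: Sequence[str]
) -> List[List[float]]:
    """Build one row per record, with columns in the order given by `features`."""
    return [[record[name] for name in features] for record in records]


# Hypothetical records; in the real dataflow these would come from the input layer.
records = [
    {"height": 1.0, "weight": 2.0},
    {"height": 3.0, "weight": 4.0},
]

matrix = records_to_matrix(records, ["height", "weight"])
print(matrix)
```

Fixing the column order explicitly keeps every row aligned, so later processing can treat the result as a regular matrix.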
This week I worked on refactoring the data cleanup operations. I had made a mistake: I was not aware that we have a script which generates the Python package template for us, so I refactored the code to match. I also started creating an example of how to use the data cleanup operations on a real-world dataset.
Next I will implement two more examples of how to clean a dataset within the dataflow itself.
Yes, I needed help with how to pass List[List[int]] data to the cleanup operations, which work on each record rather than on the whole dataset. I figured out that we have to implement an operation implementation class for it.
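The general shape of that fix, wrapping a per-record function in a class that accepts the whole List[List[int]] dataset, can be sketched in plain Python. This is only an illustration under my own assumptions: `PerRecordOperation` and `clip_negative` are hypothetical names, not the project's actual operation implementation class.

```python
from typing import Callable, List

Record = List[int]


class PerRecordOperation:
    """Hypothetical wrapper that applies a single-record function to every row."""

    def __init__(self, func: Callable[[Record], Record]) -> None:
        self.func = func

    def run(self, dataset: List[Record]) -> List[Record]:
        # The cleanup function only ever sees one record at a time.
        return [self.func(record) for record in dataset]


def clip_negative(record: Record) -> Record:
    """Example cleanup step: replace negative values with zero."""
    return [max(value, 0) for value in record]


cleaned = PerRecordOperation(clip_negative).run([[1, -2, 3], [-4, 5, -6]])
print(cleaned)
```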
This week I documented the sklearn scorers that I had implemented earlier. I also worked on the dataflow create short command: I worked out how to parse Dict fields on the command line, and I got as far as parsing the operations and the inputs.
I will complete the implementation of the dataflow create short command.
Yes, I was stuck on the implementation of the create short command, but I was able to figure things out by reading through the code and understanding how dataflows work.
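As a rough illustration of what parsing Dict fields from the command line can look like, here is a small sketch. It assumes a made-up `key.subkey=value` token syntax. The function name `parse_dict_args` and the example tokens are hypothetical and do not reflect the actual syntax of the create short command.

```python
from typing import Any, Dict, List


def parse_dict_args(tokens: List[str]) -> Dict[str, Any]:
    """Turn tokens such as "a.b=1" into a nested dict like {"a": {"b": "1"}}."""
    result: Dict[str, Any] = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            # Walk down, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = value  # Values stay as strings in this sketch.
    return result


parsed = parse_dict_args(["source.filename=data.csv", "source.tag=train", "limit=10"])
print(parsed)
```

Values are left as strings here. A real command would also need to convert them to the types each field expects.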