Articles on vipulgupta2048's Bloghttps://blogs.python-gsoc.orgUpdates on different articles published on vipulgupta2048's BlogenFri, 23 Aug 2019 03:21:54 +0000Return GSoC; // Week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/return-gsoc-week-that-has-been-2048/<p>We are done.</p> <p>First, check out the pull request for the work product - <a href="https://github.com/scrapinghub/spidermon/pull/201">https://github.com/scrapinghub/spidermon/pull/201</a></p> <p>Project Repo - https://github.com/vipulgupta2048/mygsoc</p> <p>All tasks have been completed as per the project proposal. </p> <p>The Cerberus validation library has now been integrated with Spidermon and its validation pipelines, so users can easily test their data items against custom schemas they define, with little or no configuration. </p> <p>It brings me great joy to end on a fulfilling note, having contributed to Spidermon and the Scrapy Project as part of Google Summer of Code 2019. I am happy and content with the work produced. 
</p> <p>The PR includes:</p> <ul> <li>CerberusValidator() class for item validation through Cerberus.</li> <li>Translator for translating errors for a better, unified system working with other validation methods.</li> <li>Complete integration with Scrapy pipelines, working with raw schemas, URLs, and paths.</li> <li>Unit + integration tests for each component in place.</li> <li>Documentation for the Cerberus validation method.</li> </ul> <p>For system testing, one could go ahead and use the pre-configured Quotes spider <a href="https://github.com/vipulgupta2048/testing_quotes">https://github.com/vipulgupta2048/testing_quotes</a> and install Spidermon from the master branch of my fork.</p> <p>This project has been completed with long nights of reading and writing code, learning new concepts on the fly and asking hundreds of pop-questions on Slack, which were duly answered by my mentors <a href="https://github.com/ejulio">@ejulio</a> <a href="https://github.com/rennerocha">@rennerocha</a>. Without their constant help, motivation, and guidance, completing this uphill task would never have been possible.</p> <p>Thank you all for reading. </p> <p>You can check out more blogs here - <a href="https://mixstersite.wordpress.com/gsoc/">https://mixstersite.wordpress.com/gsoc/</a></p>vipulgupta2048@gmail.com (vipulgupta2048)Fri, 23 Aug 2019 03:21:54 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/return-gsoc-week-that-has-been-2048/The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-5/<p><strong>Week #12 07/08 to 13/08</strong></p> <p><b>Well, as far as the flow goes, CerberusValidator works with schemas that are Mapping structures: basically, any dict whose values are themselves dicts describing the rules (such as the type) for each field. 
If you don’t get it, then check this out <a href="https://docs.python-cerberus.org/en/stable/">https://docs.python-cerberus.org/en/stable/</a></b></p> <p><b>But Cerberus only cares about the schema and data it is getting from the user, not where they come from. Most of our users will supply the schema either as a URL or as a path to a file. Which is fine by us, until the point somewhere in week 12 where I realized I had forgotten to code that properly. Nothing to be afraid of, I just had to redo some old functions. Actually improved a lot of old code in the process. How time flies by. Damn.</b></p> <p><b>Not much is left to be done, except writing a few more tests, a lot of testing, and merging it to master. I am confident we can make it before August 19. Let’s see. Fingers crossed. This is vipulgupta2048 signing off for the second last time here. I won’t be going anywhere, in case you were wondering. </b><br>  </p> <p><b>There is a lot of work to be done at ScrapingHub x The Scrapy Project. <br> Looking forward to new challenges. <br> Check progress on --&gt; </b><a href="https://github.com/vipulgupta2048/spidermon/projects/1">https://github.com/vipulgupta2048/spidermon/projects/1</a></p>vipulgupta2048@gmail.com (vipulgupta2048)Mon, 12 Aug 2019 11:25:38 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-5/The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-4/<p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Week #11 31/07 to 06/08</b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">I wrote my docs. I broke all the tests. I shifted quite a lot of code around. 
</b></p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">And now, I am fixing it all up. Thank God for git!</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Documentation is critical for any open-source project. And as an avid documentation writer, I have a lot of experience writing docs. It’s something I feel good doing. Cerberus docs are no different. I worked on 3 PRs this week: </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">#5 being the old Cerberus Integration PR, whose tests are still being written - <a href="https://github.com/vipulgupta2048/spidermon/pull/5">https://github.com/vipulgupta2048/spidermon/pull/5</a> </b></p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">#6 being the Docs PR - <a href="https://github.com/vipulgupta2048/spidermon/pull/6">https://github.com/vipulgupta2048/spidermon/pull/6</a></b></p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">#500 being the Cerberus PR which I opened long ago to add new examples to the Cerberus documentation - <a href="https://github.com/pyeve/cerberus/issues/500#issuecomment-507715444">https://github.com/pyeve/cerberus/issues/500</a></b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Working full steam ahead for the last week. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">What is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Thankfully, just 13 more tasks. Well, I am somewhat of an over-enthusiastic person when it comes to opening project cards. So, I have a lot of personal work to be done. 
</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Get all the latest updates here - <a href="https://github.com/vipulgupta2048/spidermon/projects/1">https://github.com/vipulgupta2048/spidermon/projects/1</a></b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-6ed140ca-7fff-e8b3-7203-2aff80ac0682">Strangely, pytest-mock took a lot of understanding. I still don’t get it. Not for long, not for long. </b></p> <p> </p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 06 Aug 2019 14:29:41 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-4/We are in the endgame NOW @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/we-are-in-the-endgame-now-2048/<p> </p> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">Week #10 24/07 to 30/07</b></p> <p>Well, only 2 weeks and some days left to go. Oh boy, what a time it has been. I wish to keep working if they let me. <br>  </p> <h1 dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">Integration finally worked out!! You know what that means? That means my project is almost complete. </b><b id="docs-internal-guid-bbafdbbc-7fff-a300-c1b0-16c65bc07994">Here’s an informal take on how the week went. It was bumpy codewise, but we made it through to this outcome. </b></p> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">To be very frank, Julio, I haven't had my fair share of practice with comprehensions in Python and this took a minute to figure out, as did the entire test_pipelines.py and pipelines.py, which took days to get through. 
This isn't complex Python, it's good code, but there is just so much going on. I am not sure the tests that I created are the best possible, because I kept going back and forth through the code, unable to figure out which output came from which function in this part of the code. One can't just throw in logging statements and run the file or project the way we normally do. And I just wanted to do it on my own at that point, because I thought that with a bit more effort on this last bit, things might get clearer. And they did. I am happy that I did the work that was needed.</b></p> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">At one point on Sunday night, I just gave up and initialized the ItemValidationPipeline(), imported everything just to see what was going on line by line. Good hunting. I am happy that it worked out (Cerberus Integration), but not happy with the tests and would like to make them better. Codewise.</b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">What is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">We are left with unit tests for the Cerberus-integrated part of the pipelines, documentation for the features and, last but not least, system testing. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">I am not sure this question brings me joy to answer. </b></p> <p dir="ltr"><b id="docs-internal-guid-d2d6b534-7fff-8bb0-35de-35b36ebdf943">So, I say yes!! I got stuck in a lot of places around this weekend. But I am proud to say that, with the guidance of my mentors and some of my own will, the confidence to debug the lines of code written this week was never broken, and will never be broken. 
Thank you everyone who helped!</b></p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 30 Jul 2019 16:55:04 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/we-are-in-the-endgame-now-2048/The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-3/<p> </p> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Week #9 17/07 to 23/07</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Well, integration isn’t working out, and I am not giving up either. Also, Evaluation 2 is coming up!</b></p> <h1 dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Well, another week, another PR merged. Do PRs that you merge yourself count? - <a href="https://github.com/vipulgupta2048/spidermon/pull/4">https://github.com/vipulgupta2048/spidermon/pull/4</a>. The Translator has now been officially completed. </b></p> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Over to integration, the ride has been quite bumpy, as Cerberus is not being detected by the pipeline. No worries. We have settled on a methodology to solve this problem. We will first check if CerberusValidator works in the ItemValidationPipeline; if it does, then Spidermon works. And then we will start worrying about why it doesn’t work in other places. </b></p> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Oh and I found a bug - <a href="https://github.com/scrapinghub/spidermon/issues/192">https://github.com/scrapinghub/spidermon/issues/192</a></b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">What is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">For now, if you like to know. 
We will be completing the ItemValidationPipeline, then moving on to integration, which rounds this project up successfully. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-6a08b2ec-7fff-63c1-c29d-71153c61a64c">Don’t even ask. I somehow wasn’t able to install Scrapy on my first go (didn’t read the docs) and couldn’t implement the JSONSchemaValidator (didn’t read the docs enough times, with a magnifying glass). So yeah, bumpy. </b></p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 23 Jul 2019 14:22:56 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-3/The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-1/<p><br>  </p> <p><a></a> <b>Week #8 10/07 to 16/07</b></p> <p><br>  </p> <p><a></a> <b>I just realised that there aren’t many weeks left. Good times like these should never end. </b></p> <p><br>  </p> <h1 class="western"><a></a> <b>What did you do this week?</b></h1> <p><a></a> <b>Good news, my PR for the validator has finally been merged. I am proud of it, great things coming forward → <a href="https://github.com/vipulgupta2048/spidermon/pull/2">https://github.com/vipulgupta2048/spidermon/pull/2</a></b></p> <p><br>  </p> <p><a></a> <b>Worked on finishing up the Translator as well. We had a change in direction in how we are going ahead with writing the tests for that class. Julio's guidance, especially on how to think about writing better tests, really helped me out. Also, something extremely useful that I realized from using TDD in my thinking and coding is that only while testing do I find several edge cases that I never would have thought about. 
Check this out.</b></p> <p><br>  </p> <p><a></a> <b>*After testing*</b></p> <p><a></a> <b>&gt; r"^required field$":messages.MISSING_REQUIRED_FIELD,</b></p> <p><a></a> <b>This pattern allows only the exact string "required field" to be matched. Which is what is needed, and works great. Here's the catch.</b></p> <p><br>  </p> <p><a></a> <b>*Earlier, before testing, I had*</b></p> <p><a></a> <b>&gt; r"required field":messages.MISSING_REQUIRED_FIELD,</b></p> <p><a></a> <b>Which led to all of these strings matching as well.</b></p> <p><a></a> <b>-  "not found required field"</b></p> <p><a></a> <b>- "aa required field aa"</b></p> <p><a></a> <b>- "required field almost anything here" and they all were getting translated.</b></p> <p><br>  </p> <p><a></a> <b>Without testing, this would have led to all kinds of troubles and TypeErrors. I am thankful, to say the least, that testing has now become an integral part of my development work. Hence, the quote </b></p> <p><br>  </p> <pre class="western" style="text-align: center;"><a></a><b>Good things happen when we test.</b> <a></a><b>- Vipul Gupta (2019-20)</b> </pre> <p><br>  </p> <h1 class="western"><a></a> <b>What is coming up next? </b></h1> <p><a></a> <b>Start with the refactoring of the itemvalidation pipelines, since that’s the more important task at hand. 
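</b></p> <p>Going back to the translator patterns from this week for a moment, the difference between the anchored and the unanchored pattern can be sketched in plain Python. This is only an illustration, not Spidermon's translator: the <code>translate</code> helper and the value of <code>MISSING_REQUIRED_FIELD</code> are made up for the example, and it assumes the patterns are applied with <code>re.search</code>.</p>

```python
import re

# Hypothetical stand-in for messages.MISSING_REQUIRED_FIELD from the post.
MISSING_REQUIRED_FIELD = "Missing required field"

# The pattern I had before testing, and the anchored one I ended up with.
UNANCHORED = {r"required field": MISSING_REQUIRED_FIELD}
ANCHORED = {r"^required field$": MISSING_REQUIRED_FIELD}


def translate(message, patterns):
    """Return the translation of the first pattern found in message, else None."""
    for pattern, translated in patterns.items():
        # Assumption: patterns are searched anywhere in the message.
        if re.search(pattern, message):
            return translated
    return None


samples = [
    "required field",
    "not found required field",
    "aa required field aa",
    "required field almost anything here",
]

for sample in samples:
    print(repr(sample), translate(sample, UNANCHORED), translate(sample, ANCHORED))
```

<p>With the unanchored pattern every sample gets translated, while the anchored one only translates the exact message and returns None for the rest.</p> <p><a></a> <b>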
And it is now priority one for the Team Cerber</b></p> <p><br>  </p> <p><a></a> <b>Here’s the big missing feature that I will also be tackling.</b></p> <pre class="western"><a></a><b>Schema = {'quotes': {'type': ['string', 'list'], 'schema': {'type': 'string'}}}</b> <a></a><b>Data = {'quotes': [1, 'Heureka!']}</b> </pre> <p><a></a> <b>Error found while testing - </b></p> <pre class="western"><a></a><b>TypeError: {'quotes': [{0: ['must be of string type']}]}</b> </pre> <p><br>  </p> <p><a></a> <b>About this, this is _something special_ with Cerberus.</b></p> <p><a></a> <b>*Reference* - <a href="https://docs.python-cerberus.org/en/stable/validation-rules.html#type">https://docs.python-cerberus.org/en/stable/validation-rules.html#type</a> </b></p> <p><a></a> <b>*Context* - To introduce some diversity into the tests, I added this type of schema, where you can set multiple values as the `type` of a field and the `schema` key governs what the type of the inner elements would actually be.</b></p> <p><a></a> <b>*About the Error* - I actually added the comments there, because the error we are getting is actually a parsing problem with the `Validator.py` parent class.</b></p> <p><a></a> <b>Usually, the errors we get and parse are in the form of {field_name: message}, but here we are getting {field_name: {Array_element: message}}, which I think is causing the TypeError. It is something previous developers didn't account for, since they never saw it coming with Cerberus. Cerberus is pretty good at showing detailed errors, hence I mentioned something related to not adding all the messages into the translator. But this is something good that we caught here, as it would never have fit our use case in the future... Well, at least that's what my theory is.</b></p> <p><br>  </p> <h1 class="western"><a></a> <b>Did you get stuck anywhere?</b></h1> <p><a></a> <b>Yep, and I have been communicating a lot more with my nimble questions. 
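</b></p> <p>As an aside on the nested error shown earlier in this post: one way to think about the mismatch between {field_name: message} and {field_name: {Array_element: message}} is to flatten the nested shape before parsing it. The sketch below is only my illustration of that idea, with a hypothetical <code>flatten_errors</code> helper; it is not what Spidermon or Cerberus actually do.</p>

```python
def flatten_errors(errors, prefix=""):
    """Flatten a nested error mapping into {dotted_path: [messages]}.

    Hypothetical helper: handles both {'field': ['msg']} and
    {'field': [{0: ['msg']}]}, the shape seen in the error above.
    """
    flat = {}
    for key, values in errors.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        for value in values:
            if isinstance(value, dict):
                # Nested errors (e.g. per list element): recurse with a longer path.
                for sub_path, messages in flatten_errors(value, path).items():
                    flat.setdefault(sub_path, []).extend(messages)
            else:
                flat.setdefault(path, []).append(value)
    return flat


nested = {'quotes': [{0: ['must be of string type']}]}
print(flatten_errors(nested))
```

<p>Here the element index becomes part of the key ('quotes.0'), so every value is again a plain list of messages, which is the shape the existing parsing expects.</p> <p><a></a> <b>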
I feel a lot better asking, answering and discussing problems. Glad to have figured that one out from my 1st eval. Quite happy with the work that’s happening as well. </b></p> <p style="margin-bottom: 0in; line-height: 100%;"> </p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 16 Jul 2019 14:31:24 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048-1/The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048/<p> </p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">Week #7 03/07 to 09/07</b></p> <h1 dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">I tested. A LOT!</b></p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">Well, this week I have been testing, refactoring and rethinking quite a lot of components for both the validator and the translator. By rethinking components, I mean I rewrote the same 100 lines of validator code about three times now, improving it so much that Git shows the changes as upward of 70% on almost every commit. There are some great changes being suggested in the detailed reviews by Renne and Julio on my pull request. I feel that I know quite a lot of new things about Cerberus and its workings. </b></p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">(Vipul is content with the progress, and lately, the mentors are too, so everyone is happy.)</b></p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">We are getting close with the validator, almost mergeable. 
Check out the PR here, let us know what more we could be doing - <a href="https://github.com/vipulgupta2048/spidermon/pull/2">https://github.com/vipulgupta2048/spidermon/pull/2</a> </b></p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">The translator is as ready as it can get. We just have to keep writing unit tests for it and adding new messages so that errors are passed through it well. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">What is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">We will be finishing the translator this week, and starting with the refactoring of the itemvalidation pipelines as soon as possible, since that’s the more important task at hand. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">I have been asking my mentors several mini-questions regarding code, best practices, and what the best way to get X done is. I aim for them to take as little time as possible. I think it's working, because nowadays I feel I am more committed to the project and able to get a lot more done. And that’s a good thing. At least for me. </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">That’s about it, thank you for reading. How about this time, we have some comments to see if these small posts are even read to the end. I do make a good effort to make them fun. Writing is something that I enjoy doing. 
</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-3ec70b01-7fff-02ac-28c7-621f2b78ca90">This is vipulgupta2048 signing out, don’t forget to comment!</b></p> <p dir="ltr"> </p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 09 Jul 2019 16:39:06 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/the-week-that-has-been-2048/1st Eval, Mistake working remotely, and the special week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/1st-eval-mistake-working-remotely-and-the-special-week-that-has-been-2048/<p> </p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">Week #6 26/06 to 02/07</b></p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">Well, I survived the first evaluation, as you can all see. I made some mistakes along the way, recovered with advice from my mentors and am hopefully going strong into work period 2. Let’s talk shop, yes. </b></p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">Since this is a special blog, I will be asking the questions... and answering them too. </b></p> <h1 dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">What did you do this week and what is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">I worked. Most people take the week off during evaluation week. But I know one thing for certain: when college reopens in July, the pressure will start to pile up a bit. And these days can really make the difference between tough weeks ahead and easy sailing. Sharing some experience I had from GSoC 2018. 
</b></p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">If you take a look here - <a href="https://github.com/vipulgupta2048/spidermon/projects/1">https://github.com/vipulgupta2048/spidermon/projects/1</a></b></p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">One of the main components of the Cerberus work, the validator, is now in review, thoroughly tested and extremely powerful. I think Cerberus would be a worthy addition to the validation pipeline. The other component, the translator, has also been created and will be finished as we go along. </b><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">The next major task that I would like to take on is getting Cerberus to play nice with the other pipelines, Schematics and JSONSchema. Most of that work has been done as well, but it doesn’t work yet, so it needs debugging, that is all. So, all in all, good work. In the next meeting, my mentors and I will assess and review the milestones for the next period.</b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">What did you love about working with ScrapingHub?</b></h1> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">The thing that I truly loved about ScrapingHub is the feeling of working remotely, with a good amount of discipline and commitment. Google Summer of Code provides us with a great opportunity to truly improve upon our work and skills, and pushes us out of our comfort zone. I feel great being able to learn so many things on the fly as well as getting guidance from my awesome mentors. But it doesn’t push us into a working schedule. There is work that needs to be done for the week and, as someone who loves chasing deadlines under pressure, I was usually doing work only around the weekends. Well, until I started working with ScrapingHub. 
</b></p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">The Scrapy Project and ScrapingHub have been great. I have been getting some good challenges to work towards, and lately, have resolved my shortcomings related to communication as well as the work that needed to be done. I feel a good change has come about: I don’t work that hard. I distribute time evenly over the day, still write a lot of blogs, break down my tasks into smaller bits and look for feedback wherever possible. Life’s good working with ScrapingHub.  </b><br>  </p> <h1 dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">What has the 1st work period taught you professionally as well as mentally? </b></h1> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">The 1st work period helped me realize that things are almost never as simple as they seem. The more time I spent reading the code and documentation, trying to build a bigger picture in my head, the more I understood how big a task I am undertaking. This also helped in reassessing the time as well as recalibrating the effort that was being put into it. </b><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">I learned about debugging, testing, documentation, module management, Python packaging, absolute and relative imports, defaultdicts, __new__, list comprehensions, code readability, code coverage, logging, and tons of best practices. I am looking forward to learning even more, faster. Leveling up my Python, one step at a time. </b><br>  </p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">That’s about it for the time, folks. 
</b></p> <p dir="ltr"><b id="docs-internal-guid-1fd0cb56-7fff-ee07-9c1a-4e565a5a574f">Live in the mix, this is vipulgupta2048 signing off.</b></p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 02 Jul 2019 01:08:30 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/1st-eval-mistake-working-remotely-and-the-special-week-that-has-been-2048/[#5] The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/5-the-week-that-has-been-2048/<p> </p> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">Week #5 19/06 to 25/06</b></p> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">The first evaluation is here. I got done with a milestone and took a small break for a personal event. </b></p> <h1 dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">This week I had to attend a wedding and hence took leave from work. I informed my mentors early of my absence from 23rd to 25th June, did the work for the week early and am now writing the blog post. </b><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">This week, I finally finished implementing the validate method for Cerberus. Previously I made the mistake of not implementing it through the existing pipeline, hence it was returning the wrong output. Here’s a snippet of it working correctly.   
</b></p> <pre dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">&gt;&gt;&gt; from spidermon.contrib.validation.cerberus.validator import CerberusValidator</b>
<b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">&gt;&gt;&gt; validator = CerberusValidator({'number': {'type': 'number'}, 'name': {'type': 'string'}})</b>
<b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">&gt;&gt;&gt; validator.validate({"name": "sda","number":9})</b>
<b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">(True, defaultdict(&lt;class 'list'&gt;, {}))</b>
<b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">&gt;&gt;&gt; validator.validate({"price":59,"name": 7,"number":"This is cool"})</b>
<b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">(False, defaultdict(&lt;class 'list'&gt;, {'name': ['must be of string type'], 'number': ['must be of number type'], 'price': ['unknown field']}))</b>
</pre> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">I learned about defaultdict and @property decorators as well as several things about the existing validator pipeline. Kudos to Renne for having the patience to help me understand it.</b></p> <h1 dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">What is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">Now, we write unit tests for the validator, following a simple yet effective TDD approach, and work towards making the translator for the validator. My college is opening soon and hence I would like to get more work done. Next up, I am writing a special post about defaultdict as well as Python decorators.</b></p> <h1 dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">Yes, working remotely is quite a new experience for me. With GSoC, I often try to make the most of it. 
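</b></p> <p>A quick aside on the defaultdict in the validate() output from this week. The sketch below shows, under my own assumptions, how a defaultdict(list) can collect several messages per field and produce the same (ok, errors) shape as the snippet. The <code>collect_errors</code> helper is hypothetical and is not Spidermon's implementation.</p>

```python
from collections import defaultdict


def collect_errors(pairs):
    """Group (field, message) pairs into a (ok, errors) tuple.

    Hypothetical helper mirroring the shape of the validate() output:
    ok is True only when no errors were collected.
    """
    errors = defaultdict(list)
    for field, message in pairs:
        # No need to check whether the key exists: a missing key
        # starts out as an empty list automatically.
        errors[field].append(message)
    return (not errors, errors)


print(collect_errors([]))
print(collect_errors([
    ("name", "must be of string type"),
    ("price", "unknown field"),
]))
```

<p>The nice part is that the calling code never has to initialise a field's list before appending to it.</p> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">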
Somewhere I feel I am lacking, and need to be more disciplined. I thought that, for this section, I should at least once talk about things psychologically rather than about the problems I am having in my code. Of which there is no shortage at any given moment. </b></p> <p dir="ltr"><b id="docs-internal-guid-67c78d2f-7fff-52e4-4875-a1d2152c0b20">See you all next week if I get through the evaluation, fingers crossed. This is Vipul Gupta signing out!</b></p>vipulgupta2048@gmail.com (vipulgupta2048)Wed, 26 Jun 2019 12:25:02 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/5-the-week-that-has-been-2048/[#4] The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/4-the-week-that-has-been-2048/<p> </p> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">Week #4 12/06 to 18/06</b></p> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">Well, this has been another rather testing week. </b></p> <h1 dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">We are trying to complete the validate function for Cerberus, and get it tested for integration. It’s all coming along really well. I got to learn several new tools and services such as IPDB and PyTest, and I am working on adding logging for errors in my codebase. I think things could be better. And that’s what I will be working hard on over the next week. </b></p> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">Here’s the project we are going to follow - <a href="https://github.com/vipulgupta2048/spidermon/projects/1">https://github.com/vipulgupta2048/spidermon/projects/1</a></b></p> <h1 dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">What is coming up next? 
</b></h1> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">I will be preparing to get into the best shape possible for the first evaluation. I have set some goals for myself regarding the integration of Cerberus. I would like to work hard towards completing each and every one of them to the best of my knowledge. </b></p> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">I wrote a blog about the Spidermon validation pipeline, and how it works - <a href="https://mixstersite.wordpress.com/2019/06/16/sprinkling-some-insight-into-how-spidermon-validation-pipelines-works/">https://mixstersite.wordpress.com/2019/06/16/sprinkling-some-insight-into-how-spidermon-validation-pipelines-works/</a></b></p> <h1 dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-19fde799-7fff-1cbc-81ad-85cf78ede2dc">Several times, while debugging the code with IPDB, but Julio helped me fix and validate the process that I was applying, including suggesting that I go for PyTest. </b></p>vipulgupta2048@gmail.com (vipulgupta2048)Wed, 19 Jun 2019 05:17:40 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/4-the-week-that-has-been-2048/[#3] The Week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/3-the-week-that-has-been-2048-1/<p><b id="docs-internal-guid-09cfcceb-7fff-36f1-07b9-f10504065402">If distance is the length of the path traveled between 2 points, and displacement is the shortest path you can take to reach your destination from the initial point, </b><b id="docs-internal-guid-09cfcceb-7fff-36f1-07b9-f10504065402">then I say that, after making full circles this week, my overall displacement is 0. But I sure as hell have come a long way in learning more about Python as a programming language, just by reading, understanding and implementing more new code concepts than ever before. 
</b><br></p> <h1 dir="ltr"><b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">I kept working on implementing the validate() feature of the Cerberus pipeline. I struggled with some errors, but Stack Overflow and some awesome Python packaging docs were always there for me. This is taking a bit longer than I expected: I now understand the code and even wrote the implementation (which can be found here - <a href="https://github.com/vipulgupta2048/spidermon/tree/cerberus">https://github.com/vipulgupta2048/spidermon/tree/cerberus</a>), but several bugs stand in the way of perfecting it. Hence, I will continue working on that. </b></p> <h1 dir="ltr"><br> <b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">What is coming up next? </b></h1> <p dir="ltr"><b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">Next up, the most immediate task is to fix, refactor and get the basic Cerberus validation up and running. This is a priority task for me, as getting it done is critical to being in a better position to pass the Round 1 evals. </b><b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">I plan to give my project more focus next week. I also plan to finish my other tasks and my side project, and to resolve a PR as well. Things should be looking way better in the next report. </b></p> <h1 dir="ltr"><br> <b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">Oh! I ran into tons of bugs, mistakes, and errors this week, and spent a lot of time figuring out Python packaging and how to install local packages, because I had taken the rudimentary approach of repackaging Spidermon every time I made a change to it. 
</b><b id="docs-internal-guid-d294c620-7fff-475f-56fd-6ab128ede0dc">To my surprise, the -e flag of pip install installs a local package in editable mode, so there is no need to re-package after each change. Kudos to Renne for his guidance. I must never have actually made sense of the error that I was getting.</b></p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 11 Jun 2019 20:09:37 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/3-the-week-that-has-been-2048-1/[#2] The week that has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/2-the-week-that-has-been-2048/<p> </p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Week #2 - 28/05 to 04/06</b></p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Well, this has been a good week of learning about new things, revising old concepts and reading the implementation of one of the oldest modules in Python to understand the idea behind Python packaging. I feel bad about not being able to write a lot of code, but I think that without understanding the existing code base, the way forward would have been fruitless and more disappointing. So, let’s start by answering our 3 infamous questions, and later I will give you a broad picture of Python packaging, explained as if to a 5-year-old. </b><br>  </p> <h1 dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">What did you do this week?</b></h1> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Well, the PRs that I was working on, with some critical fixes to the docs, have been merged. Thanks to Renne, Adrian and Julio for getting my first PR merged. I also worked on creating a draft pipeline for validating data through Cerberus. 
I got completely sidetracked in that regard, as I was busy trying to make sense of the entire code flow and answering some serious questions about Python packaging and the Spidermon directory structure, as well as new things such as: </b></p> <ol> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Absolute vs relative imports - why do we use them?</b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Code quality, linting, testing, deployment, and best practices to make it better. </b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Decorators in Python </b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Access modifiers and real use cases in production-ready code</b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Just for kicks, I created my own package. It was fun as well as quite a learning experience for me. </b></p> </li> </ol> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Status of the mini-project: once the validation bit was completed, it had served its purpose, and that’s where I left it. I will start work on the PostgreSQL pipelines next, whenever I feel like it. </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Otherwise, a good week nonetheless. </b></p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Issue tracking is now set up with GitHub Projects. Check it out here - <a href="https://github.com/vipulgupta2048/spidermon/projects/1">https://github.com/vipulgupta2048/spidermon/projects/1</a></b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">What is coming up next? 
</b></h1> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Well, in the recent meeting, my mentors Renne and Julio did me a solid by helping me figure out the real picture of how Spidermon works, from start to validation to finish. I am not sure how I would otherwise have connected all the bits and pieces involved in this project. As far as milestones go, coming up next is a basic validate method for Cerberus. Equipped with knowledge of the gears, wheels and screws that work behind Spidermon, I feel much more confident about it than I did last week. I think this is the beauty of GSoC: you put in a week of struggle, and the next week the struggle doubles, but the peak you are climbing starts to look a bit closer. </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Also in the pipeline is my personal blog on Mixster. Last time we talked about my community; I would like to talk about Python packaging next. As extra work, I want to take up the extra issues and features for the Slack action, as well as the docs. Let’s see if we can get to that as well. </b></p> <p dir="ltr"> </p> <h1 dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">Oh, I did. I got badly stuck, but regular visits to Stack Overflow, the Python documentation and my mentors' Slack channel helped me get over my troubles. Gotta go, gotta catch up! </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9dfd06d5-7fff-c22b-4bb5-5df6adbb2ef3">This is Vipul Gupta, signing out. 
</b></p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 04 Jun 2019 18:56:21 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/2-the-week-that-has-been-2048/[#1] The Week That has been @ 2048https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/1-the-week-that-has-been-2048/<p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Week #1 - 21/05 to 27/05 </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">In the last meeting, my mentors and I decided upon the mini-project that I suggested. Here’s a brief overview of what I decided to work on over the course of the last week: </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Main Steps description</b></p> <p style="text-align: center;"><img alt="" height="865" src="https://i.ibb.co/nMxqVj4/photo-2019-05-22-17-54-06.jpg" width="640"></p> <p> </p> <ol> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Use Scrapy to scrape data from the given website - https://amity.edu/placement </b></p> </li> </ol> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">data = {</b></p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">   'Link': 'https://amity.edu/placement/Popup.asp?Eid=3895',</b></p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">   'name': 'Impetus - Recruitment Opportunity For 2019 Batch (Apply Now) ',</b></p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">   'year': 2019</b></p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">}</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">2. 
Use data validation tools to filter out data</b></p> <ul> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Schematics</b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">JSON Schema</b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Cerberus</b></p> </li> </ul> <p dir="ltr"> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">3. Same project spec for each tool to fully try them out</b></p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">4. Store data in PostgreSQL, for the sake of completion </b></p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">5. Present data on a website, possibly using ReactJS</b><br>  </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">All in all, this project will greatly help me develop some good insight into the validation tools popularly used at ScrapingHub and how they work. Coming back to the 3 main questions that we have. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">What did you do this week? </b></h1> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Well, for starters I am writing this blog post again. Due to some bug, my original post wasn’t saved. But, no regrets. These blogs are important and should be written even if I have to write them again.</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">My week has been busy with this mini project: </b></p> <ul> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Studied the Schematics validation pipeline, implemented it in my mini project and worked out a small bug in the documentation. So, good progress. 
</b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Implemented the JSON Schema validation, ran through the tutorial to understand the various properties and features. Quite powerful. </b></p> </li> <li dir="ltr"> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Cerberus will take some time to implement; I still need to research the best way to go about it. </b></p> </li> </ul> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">This project is an ongoing thing. Once it is finished, it will really help me with the development of the Cerberus pipeline. I have also been reading about PostgreSQL pipelines for Scrapy and learned new things.</b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">I also went to the Google Summer of Code meetup in New Delhi to meet and network with other GSoC’ers here. It was a good time. </b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">What is coming up next? </b></h1> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Next up, I am working on 2 PRs and fixing an issue related to the Slack actions that has been open for quite a while. I will also be working on a draft of the Cerberus pipeline, to figure out what goes where. This will be a big Lego project with small parts that need to be stuck together to give a better picture. Looking forward to it. </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">I am also working towards better issue tracking for my project through GitHub Projects and improving the documentation of Spidermon. 
</b></p> <p> </p> <h1 dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">Did you get stuck anywhere?</b></h1> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">I did, regarding the JSON Schema validation implementation. I researched the issue, found several solutions and ran them by my mentors. It turns out the implementation is not covered in the documentation. I will add that too. Busy week ahead. </b></p> <p> </p> <p dir="ltr"><b id="docs-internal-guid-9f92d7a5-7fff-e147-999e-a08d05879343">That’s that from my side, this is Vipul Gupta signing out. </b></p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 28 May 2019 17:12:52 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/1-the-week-that-has-been-2048/2048's Weekly Check-in #0https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/2048-s-weekly-check-in-0/<h1><cms-plugin>Weekly check-in #1: 13/05 to 20/05</cms-plugin></h1> <p>Hello everyone, hope you all are doing great. I am Vipul Gupta (I go by vipulgupta2048 all over the web), checking in for the first time under the Scrapy Project. I will be working towards integrating Cerberus into Spidermon, the prime data validation library for our spiders. You can read all about it <a href="https://summerofcode.withgoogle.com/projects/?sp-search=vipulgupta2048#6742209389395968">here</a>.</p> <h1><span style="font-family: Verdana,Geneva,sans-serif;">What did you do this week?</span></h1> <p>Because of university work, I couldn't accomplish much this week. My college's summer holidays began on 17th May 2019, so most of the week was spent there. I had a call with Renne and Julio, who will be mentoring me. Renne and Julio are maintainers of Spidermon and employees of ScrapingHub. It was nice to e-meet them; we discussed summer plans and the problems the project is facing that we will be solving over the course of the summer. 
We also settled our weekly meeting times and how we will handle blog posts, pull requests, documentation, code linting, etc. Moreover, I decided to study how the 2 validation techniques currently integrated with Spidermon work, which should help me understand the importance of pipelines and contribute towards the bigger picture. I thought of a mini-project idea to implement the same, and will discuss it more in the weekly blog.  </p> <h1>What is coming up next?</h1> <p>Well, in the next meeting we will be deciding the evaluation-by-evaluation goals that need to be completed for the project. This will help both my mentors and me to track my progress and work accordingly. I will also set up the recommended environment on my system and start documenting whatever I am doing. There ain't much to set up now, but I would like to be thorough. I am also working on some documentation for the actions of Spidermon as part of issue #141.<br> I will also be researching possible ways to integrate Cerberus into Spidermon. Quite excited for it. </p> <h1>Did you get stuck anywhere?</h1> <p>No specific issues yet, just trying to get a bigger picture of what we are trying to do here. </p> <p>Thank you for reading!</p> <p><strong>Vipul Gupta</strong><br> Would love to connect - Twitter? <a href="http://twitter.com/vipulgupta2048">@vipulgupta2048</a> all over the web.</p>vipulgupta2048@gmail.com (vipulgupta2048)Tue, 21 May 2019 14:01:14 +0000https://blogs.python-gsoc.org/en/vipulgupta2048s-blog/2048-s-weekly-check-in-0/