Weekly Check-in #11: (2 Aug - 8 Aug)

Published: 08/06/2019

What did you do this week?

  • Added an API description and more usage examples to the README.
  • Added a PyPy test environment.
  • Opened a pull request to add Protego integration to Scrapy.
  • Modified Protego to treat non-terminal dollar signs as ordinary characters.
  • Minor aesthetic changes. 
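
To illustrate the dollar-sign change listed above, here is a minimal sketch of the intended matching behaviour. It is not Protego's actual code: `pattern_to_regex` and `matches` are hypothetical names, and the sketch assumes that only a trailing `$` anchors a rule to the end of the path while `*` matches any run of characters.

```python
import re


def pattern_to_regex(pattern):
    """Hypothetical helper: compile a robots.txt path pattern to a regex."""
    # Assumption: only a '$' in the final position anchors the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters. Every other character, including
    # any '$' that is not in the final position, is escaped and matched
    # literally.
    literals = [re.escape(part) for part in pattern.split("*")]
    regex = ".*".join(literals)
    if anchored:
        regex += r"\Z"
    return re.compile(regex)


def matches(pattern, path):
    # re.match anchors at the start, since rules apply to path prefixes.
    return pattern_to_regex(pattern).match(path) is not None
```

With this sketch, `matches("/a$b", "/a$b/c")` is true because the inner `$` is an ordinary character, while `matches("/*.php$", "/index.php?x=1")` is false because the trailing `$` still anchors the rule.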

What is coming up next?

  • Transferring the Protego repository to the Scrapy organisation on GitHub. It seems that write permissions are necessary for initiating the transfer process.
  • Modifying Protego to treat wildcards such as `*` and `$` as ordinary characters as well.
  • Modifying `SitemapCrawler` to use the new interface.
  • Implementing support for `host` & `crawl-delay` directives in Scrapy. 
  • Replacing regex with custom pattern-matching logic might improve performance, but I am not sure. I will need to test it.
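
As a rough idea of what regex-free matching could look like, the sketch below splits a rule on `*` and searches for each literal piece in order with plain string methods. `wildcard_match` is a hypothetical name, not part of Protego, and whether this is actually faster than regex is untested; it assumes the same semantics as above (a trailing `$` anchors the end, any other `$` is literal).

```python
def wildcard_match(pattern, path):
    """Hypothetical regex-free matcher for robots.txt path patterns."""
    # Assumption: only a trailing '$' anchors the rule to the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]

    parts = pattern.split("*")

    # The first literal piece must be a prefix of the path.
    first = parts[0]
    if not path.startswith(first):
        return False
    pos = len(first)

    # No wildcard at all: a plain prefix rule (or exact match if anchored).
    if len(parts) == 1:
        return not anchored or pos == len(path)

    # Middle pieces: take the leftmost occurrence of each, in order.
    for part in parts[1:-1]:
        idx = path.find(part, pos)
        if idx == -1:
            return False
        pos = idx + len(part)

    last = parts[-1]
    if anchored:
        # The last piece must end the path without overlapping earlier pieces.
        return len(path) - len(last) >= pos and path.endswith(last)
    return path.find(last, pos) != -1
```

Taking the leftmost occurrence of each piece leaves as much of the path as possible for the remaining pieces, so no backtracking is needed.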

Did you get stuck anywhere?

  • Faced problems setting up the PyPy test environment. With help from my mentors, I was able to solve the issue.