adityaa30's Blog

Weekly Check In - 12

adityaa30
Published: 08/30/2020

What did I do till now?

Last week I worked on finishing up the HTTPNegotiateDownloadHandler. The download handler currently uses ALPN or NPN (whichever is available) to negotiate a protocol (one of HTTP/1.1 or HTTP/2 for now) with the remote server and issues the request on the respective download handler. For now, all requests made via a proxy are issued directly using the HTTP11DownloadHandler.
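To make the dispatch step concrete, here is a minimal sketch using only Python's standard `ssl` module. It is not the actual HTTPNegotiateDownloadHandler code: the `HANDLERS` mapping and the helper names `make_client_context` and `pick_handler` are assumptions made for illustration, and the handlers are represented by plain strings.

```python
import ssl

# Hypothetical mapping from ALPN protocol id to a handler name (illustrative only).
HANDLERS = {
    "h2": "H2DownloaderHandler",
    "http/1.1": "HTTP11DownloadHandler",
}


def make_client_context():
    """Build a TLS client context that offers HTTP/2 first, then HTTP/1.1."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def pick_handler(negotiated, via_proxy=False):
    """Choose a handler name from the negotiated protocol.

    `negotiated` is what SSLSocket.selected_alpn_protocol() would return
    after the handshake: a protocol id string, or None if nothing was agreed.
    Proxied requests always fall back to HTTP/1.1, as described above.
    """
    if via_proxy or negotiated is None:
        return HANDLERS["http/1.1"]
    return HANDLERS.get(negotiated, HANDLERS["http/1.1"])
```

Falling back to HTTP/1.1 whenever nothing was negotiated keeps the sketch safe against servers that do not support ALPN at all.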

What's coming up next? 

I plan to continue working on implementing the CONNECT method for HTTP/2.

Did I get stuck anywhere?

Yep. I was stuck for almost a week on the CONNECT protocol. I have now managed to fix the bug where the raw TCP connection instance could not be switched to HTTP/2. However, there are still some issues during the TLS handshake with the final target resource 😥.

View Blog Post

Weekly Check In - 11

adityaa30
Published: 08/22/2020

What did I do till now?

Last week, I finalized the PR for the basic implementation of the H2ClientProtocol. The protocol now works with all the request methods except CONNECT. The work on tunneling using the CONNECT method is still in progress. I also started creating another protocol for negotiation. It uses ALPN or NPN (whichever is available) to negotiate a protocol (one of HTTP/1.1 or HTTP/2 for now) with the remote server, based on the priority given by the user via the Scrapy project's settings, and then uses the respective download handler to complete the request.
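As a rough illustration of the priority idea, the sketch below turns a user-supplied priority mapping into an ordered ALPN offer. The setting name `PROTOCOL_PRIORITY` and both helper functions are hypothetical and do not come from Scrapy. The `negotiate` function is only a toy model of the outcome: in real ALPN the server makes the choice and is free to apply its own preference order.

```python
# Hypothetical project setting: a lower number means a higher priority.
PROTOCOL_PRIORITY = {"http/1.1": 2, "h2": 1}


def alpn_offer(priority):
    """Return protocol ids ordered from most to least preferred."""
    return [proto for proto, _ in sorted(priority.items(), key=lambda kv: kv[1])]


def negotiate(client_offer, server_supported):
    """Toy model: pick the first client-preferred protocol the server supports."""
    for proto in client_offer:
        if proto in server_supported:
            return proto
    return None  # no common protocol
```

With the mapping above, the client would offer `h2` first and only end up on `http/1.1` when the server does not support HTTP/2.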

What's coming up next? 

This week I am mainly working on finishing the negotiation protocol.

Did I get stuck anywhere?

Nope. I spent more time finalizing a clean architecture last week, so most of my time went into planning. Apart from that, there were no major blockers :)

View Blog Post

Weekly Check In - 10

adityaa30
Published: 08/13/2020

What did I do till now?

I started implementing the CONNECT method for tunneling via HTTP/2. After a lot of testing, I realized that the approach I was taking was not really feasible. Next, I plan to work on an approach that first uses HTTP/1.1 CONNECT to establish a connection with the proxy and then switches to HTTP/2 for all the requests made via the proxy.
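The first half of that plan, the HTTP/1.1 CONNECT exchange with the proxy, can be sketched as below. This is a simplified illustration rather than the code from the PR: `build_connect_request` and `tunnel_established` are hypothetical helpers, and proxy authentication and partial reads are ignored.

```python
def build_connect_request(host, port):
    """Build the raw HTTP/1.1 CONNECT request sent to the proxy."""
    authority = f"{host}:{port}"
    return (
        f"CONNECT {authority} HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(response_head):
    """Return True if the proxy answered the CONNECT with a 2xx status."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)  # e.g. [b"HTTP/1.1", b"200", b"..."]
    if len(parts) < 2 or not parts[1].isdigit():
        return False
    return 200 <= int(parts[1]) < 300
```

Once the proxy reports success, the same TCP connection would be wrapped in TLS towards the target host, and HTTP/2 could then be negotiated over that tunnel via ALPN.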

What's coming up next? 

Next week, I plan to

  • Make the PR for H2ClientProtocol ready to be merged with master - verify that all cases are covered by tests, the other tests still pass, and no bugs are introduced
  • Implement the CONNECT method using a combination of HTTP/1.1 and HTTP/2

Did I get stuck anywhere?

Yes, this week I had many problems while adding tunneling support for proxies. I have planned a completely different approach for next week using HTTP/1.1 and HTTP/2. Let's see how it goes :)

View Blog Post

Weekly Check In - 9

adityaa30
Published: 08/06/2020

What did I do till now?

Last week I completed the ScrapyH2ProxyAgent implementation and added the required tests. I also went through the codebase of the hyper-h2 library to get some insight into how it implements the CONNECT method for HTTP/2.

What's coming up next?

Next week I plan to finish working on ScrapyTunnelingH2Agent, which enables a user to create an SSL tunnel and proxy requests.

Did I get stuck anywhere?

Yeah, I am stuck on a weird problem where two test cases are colliding, i.e. they are not related to each other, but they fail when I run them together and pass when I run them separately. I'm still working on finding a working fix!

View Blog Post

Weekly Check In - 8

adityaa30
Published: 07/30/2020

What did I do till now?

Last week I added tests for H2Agent and H2DownloaderHandler.

What's coming up next?

Next week I plan to continue working on ScrapyTunnelingH2Agent.

Did I get stuck anywhere?

Yes. I got stuck for a long time while setting up the testing environment for H2DownloaderHandler. The problem was a bit weird: until now, Scrapy was using Twisted's WrappingFactory class to wrap the Site instance, which allows only up to HTTP/1.1 (for unknown reasons), and it took me a long time to realize this. After removing the WrappingFactory, the test environment was set up as required. Apart from this, another hurdle I'm still facing is the CONNECT protocol in HTTP/2.0. I couldn't really find many blogs or articles on it to get a better idea. I now plan to look at how some open-source libraries implement HTTP/2.0 CONNECT.
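For reference, as far as I read RFC 7540 (section 8.3), an HTTP/2 CONNECT request differs from a normal request mainly in its pseudo-headers: `:method` is `CONNECT`, `:authority` carries the target host and port, and `:scheme` and `:path` are omitted. The sketch below builds and checks such a header list. Both helper names are hypothetical, and nothing here is taken from Scrapy or hyper-h2.

```python
def h2_connect_headers(host, port):
    """Build the pseudo-headers for an HTTP/2 CONNECT request."""
    return [
        (":method", "CONNECT"),
        (":authority", f"{host}:{port}"),
    ]


def is_valid_h2_connect(headers):
    """Check the CONNECT rules: no :scheme or :path, :authority present."""
    names = [name for name, _ in headers]
    return (
        dict(headers).get(":method") == "CONNECT"
        and ":authority" in names
        and ":scheme" not in names
        and ":path" not in names
    )
```

With hyper-h2, a list like this would presumably be passed to `H2Connection.send_headers` on a new stream, and that stream would then carry the tunneled bytes as DATA frames.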

View Blog Post