McSinyx's Blog

Sorting Things Out

McSinyx
Published: 08/03/2020

Hi! I really hope that everyone reading this is still doing okay, and if that isn't the case, I wish you a good day!

pip 20.2 Released!

Last Wednesday, pip 20.2 was released, delivering the 2020-resolver as well as many other improvements! I was lucky enough to get the fast-deps feature included in the release. A brief description of this experimental feature, along with testing instructions, can be found on Python Discuss.

The public exposure of the feature also reminded me of some further optimizations to make to the lazy wheel. Hopefully, even without download parallelization, it will not be so slow that it puts concerned pip users off testing it.

Preparation for Download Parallelization

As of this moment, we already have:

  • Multithreading pool fallback working
  • An opt-in to use lazy wheel to obtain dependency information, thus getting a list of wheels at the end of resolution, ready to be downloaded together

What's left is only to interject a parallel download somewhere after the dependency resolution step. Still, I have struggled with this way more than I ever imagined. I got so stuck that I had to give myself a day off in the middle of the week (and study some Rust), and then I came up with something that was agreed to be difficult to maintain.

Indeed, a large part of this is my fault, for not communicating the design thoroughly with pip's maintainers and not carefully noting stuff down during (verbal) discussions with my mentor. Thankfully Chris Hunt came to the rescue and did a refactoring that will make my future work much easier and cleaner.


Fifth Check-In

McSinyx
Published: 07/27/2020

Hello and I hope y'all are still doing well!

What did I do last week?

I was not really productive last week—most of the following tickets are fillers to make use of the spare cycles I had when I was still trying to figure out the way to implement the main work.

  • Finalize the --use-feature=fast-deps flag (GH-8588)
  • Improve mocking of environment variables in the test suite (GH-8614)
  • Finalize the fix for verbose/quiet options specified via configuration files and environment variables (GH-8578)
  • Clean up a tiny bit in the resolver internal API (GH-8629)
  • Start working on separating the download of wheels from dependency resolution (GH-8638)
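
For readers unfamiliar with mocking environment variables in tests, here is a minimal sketch using the standard library's unittest.mock.patch.dict. The helper verbosity_from_env and the variable name DEMO_VERBOSE are made up for this example; this is not how pip's test suite does it.

```python
import os
from unittest import mock


def verbosity_from_env(name="DEMO_VERBOSE"):
    """Return the verbosity level stored in an environment variable.

    Hypothetical helper for this example only, not part of pip.
    """
    return int(os.environ.get(name, "0"))


def verbosity_with_mocked_env(level):
    # patch.dict restores os.environ when the block exits,
    # so the change cannot leak into other tests.
    with mock.patch.dict(os.environ, {"DEMO_VERBOSE": str(level)}):
        return verbosity_from_env()
```

The point of patching rather than assigning to os.environ directly is that the original environment comes back even if the test body raises.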

Did I get stuck anywhere?

I'm struggling with refactoring the code to support separate downloads. pip's codebase was not designed for this, so many execution paths and other details are entangled around the relevant area.

What is coming up next?

pip 20.2 is going to be released within the next few days with --use-feature=fast-deps included, and I'm mentally prepared to fix any newly discovered problems. At the same time, I will continue working on GH-8638 and hopefully get it done soon enough to begin drafting download parallelization strategies, mostly regarding the UI.


I've Walked 500 Miles…

McSinyx
Published: 07/20/2020

… and I would walk 500 more
Just to be the man who walks a thousand miles
To fall down at your door

The Main Road

Hi, have you met fast-deps? It's (going to be) the name of pip's experimental feature that may improve the speed of dependency resolution of the new resolver. By avoiding downloading whole wheels just to obtain metadata, it is especially helpful when pip has to do heavy backtracking to resolve conflicts.
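
To give a feel for how metadata can be read without fetching a whole wheel, here is a rough, self-contained sketch. It is not pip's implementation: RangeReader, build_demo_wheel and read_metadata are hypothetical names, and the "server" is simulated by slicing an in-memory bytes object where a real client would send an HTTP range request. The idea is that a wheel is a zip archive, so zipfile only needs the end of the file (the central directory) plus the one member it is asked for.

```python
import io
import zipfile


class RangeReader:
    """Seekable, read-only file object backed by a byte-range fetcher.

    fetch_range(start, end) must return bytes [start, end) of the file;
    in real life that would be an HTTP request with a Range header.
    """

    def __init__(self, fetch_range, size):
        self._fetch = fetch_range
        self._size = size
        self._pos = 0
        self.bytes_fetched = 0  # bookkeeping for the demonstration

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            offset += self._pos
        elif whence == io.SEEK_END:
            offset += self._size
        self._pos = max(0, min(offset, self._size))
        return self._pos

    def read(self, size=-1):
        if size is None or size < 0:
            end = self._size
        else:
            end = min(self._pos + size, self._size)
        data = self._fetch(self._pos, end) if end > self._pos else b""
        self.bytes_fetched += len(data)
        self._pos = end
        return data


def build_demo_wheel():
    """Create a fake wheel in memory: one large file plus METADATA."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr("demo/big.bin", bytes(200_000))
        archive.writestr(
            "demo-1.0.dist-info/METADATA",
            "Name: demo\nVersion: 1.0\nRequires-Dist: requests\n",
        )
    return buffer.getvalue()


def read_metadata(fileobj):
    """Return the text of the METADATA member of a wheel-like zip."""
    with zipfile.ZipFile(fileobj) as archive:
        name = next(
            n for n in archive.namelist() if n.endswith(".dist-info/METADATA")
        )
        return archive.read(name).decode()
```

Usage would look like wrapping the fake wheel with RangeReader(lambda start, end: wheel[start:end], len(wheel)) and passing it to read_metadata; afterwards, bytes_fetched shows how little of the archive was actually touched.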

Thanks to Chris Hunt's review on GH-8537, my mentor Pradyun Gedam and I worked out a less hacky approach to inject the call to lazy wheel during the resolution process. A new PR, GH-8588, was filed to implement it. I could have just worked on top of the old PR and rebased, but my git skill is far from gud enough to confidently do it.

Testing this one has been a lot of fun though. At first, integration tests were added as a rerun of the tests for the new resolver, with an additional flag to use the fast-deps feature. It indeed made me feel guilty towards Travis, who had to work around 30 minutes more every run. Per Chris Hunt's suggestion, in the new PR, I instead wrote a few functional tests for the area relating the most to the feature, namely pip's subcommands wheel, download and install.

It was also suggested that a mock server with HTTP range requests support might be better for testing (in terms of performance and reliability). However, I have yet to be able to make Werkzeug do it.
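
For the curious, a mock server with basic range support can be sketched with nothing but the standard library. To be clear, this swaps Werkzeug for http.server and is only an illustration: PAYLOAD, parse_range, RangeHandler and fetch_range are all hypothetical names, and only the simple forms bytes=START-END and bytes=START- are handled.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = bytes(range(256)) * 4  # stand-in for the bytes of a wheel


def parse_range(header, size):
    """Turn 'bytes=START-END' or 'bytes=START-' into inclusive offsets.

    Suffix ranges ('bytes=-N') and multiple ranges are not handled.
    """
    first, _, last = header[len("bytes="):].partition("-")
    start = int(first)
    end = min(int(last), size - 1) if last else size - 1
    return start, end


class RangeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        header = self.headers.get("Range", "")
        if header.startswith("bytes="):
            start, end = parse_range(header, len(PAYLOAD))
            body = PAYLOAD[start:end + 1]
            self.send_response(206)  # Partial Content
            self.send_header(
                "Content-Range", f"bytes {start}-{end}/{len(PAYLOAD)}"
            )
        else:
            body = PAYLOAD
            self.send_response(200)
        self.send_header("Accept-Ranges", "bytes")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_range(start, end):
    """Start the mock server, request one byte range, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), RangeHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/demo.whl"
        request = urllib.request.Request(
            url, headers={"Range": f"bytes={start}-{end}"}
        )
        # An empty ProxyHandler keeps proxy settings from the environment
        # out of the way of a loopback request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_range(10, 19) should come back with status 206 and the ten bytes at offsets 10 through 19 of PAYLOAD.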

Why did I say I'm halfway there? With the parallel utilities merged and a way to quickly get the list of distributions to be downloaded being really close, what's left is only to figure out a way to properly download them in parallel. With no distribution to be added during the download process, the model will fit very well with the architecture in my original proposal. A batch downloader can be implemented to track the progress of each download and thus report them cleanly as, e.g., a progress bar or a percentage. This is the part of my GSoC project this summer that I am second-most excited about (after the synchronization of downloads written in my proposal, which was then superseded by fast-deps) and I can't wait to do it!
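
To make the batch downloader idea a bit more concrete, here is a small sketch of how per-download progress could be tracked from several threads and folded into one percentage. Everything here is an assumption for illustration (BatchProgress, fake_download and download_batch are hypothetical, and the downloads are simulated), not the design that ended up in pip.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BatchProgress:
    """Thread-safe record of how many bytes each download has received."""

    def __init__(self, sizes):
        self._lock = threading.Lock()
        self._total = dict(sizes)
        self._done = {name: 0 for name in sizes}

    def advance(self, name, amount):
        with self._lock:
            self._done[name] += amount

    def percentage(self):
        with self._lock:
            total = sum(self._total.values())
            done = sum(self._done.values())
        return 100 * done // total if total else 100


def fake_download(name, size, progress, chunk=1024):
    """Stand-in for a real download: 'receives' size bytes chunk by chunk."""
    remaining = size
    while remaining > 0:
        received = min(chunk, remaining)
        progress.advance(name, received)
        remaining -= received
    return name


def download_batch(sizes, workers=4):
    """Download every file in sizes (name -> byte count) in parallel."""
    progress = BatchProgress(sizes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(fake_download, name, size, progress)
            for name, size in sizes.items()
        ]
        finished = sorted(future.result() for future in futures)
    return finished, progress.percentage()
```

Because the set of files is fixed before any download starts, the total size is known up front, which is what makes a single overall percentage meaningful.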

The Side Quests

As usual, I make sure that I complete every side quest I see during the journey:

  • GH-8568: Declare constants in configuration.py as such
  • GH-8571: Clean up Configuration.unset_value and nit the class' __init__
  • GH-8578: Allow verbose/quiet level to be specified via config file and env var
  • GH-8599: Replace tabs by spaces for consistency

Snap Back to Reality

A bit about me: I actually walked 500 meters earlier today to a bank and walked 500 more to another to prepare my Visa card for purchasing the upcoming Pinephone prototype. It's one of the first smartphones to fully support a GNU/Linux distribution, where one can run desktop apps (including proper terminals) as well as traditional services like SSH, an HTTP server and an IPFS node, because why not? Just a few hours ago, I pre-ordered the postmarketOS community edition with additional hardware for convergence.

If you did not come here for a Pinephone ad, please accept my apologies d-; To everyone reading this, I hope you can all become the person who walks a thousand miles to fall down at the door opening to all that you ever wished for!


Fourth Check-In

McSinyx
Published: 07/13/2020

Hello there! I have the last exam of my second year tomorrow, but it feels like summer already! I've been finalizing quite a few things to get them ready for pip 20.2b2.

What did I do last week?

I've spent most of the time on getting the opt-in of obtaining dependency information via lazy wheels ready. It will be available as --use-feature=fast-deps and only has an effect when --use-feature=2020-resolver is also present.
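
The dependency between the two flags can be pictured with a tiny argparse sketch. This is not pip's option-parsing code; parse_features and fast_deps_enabled are hypothetical helpers that only mirror the rule described above.

```python
import argparse


def parse_features(argv):
    """Collect every --use-feature value given on the command line."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--use-feature",
        dest="features",
        action="append",
        default=[],
        choices=["2020-resolver", "fast-deps"],
    )
    return set(parser.parse_args(argv).features)


def fast_deps_enabled(features):
    # fast-deps only takes effect together with the new resolver.
    return {"fast-deps", "2020-resolver"} <= features
```

With both flags given, fast_deps_enabled returns True; with fast-deps alone, it returns False and the flag is simply ignored.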

While waiting for reviews and suggestions, I made some patches for internal cleanup, namely GH-8568, GH-8571 and GH-8578. Some of the similar patches I made earlier were also merged last week: GH-8456 and GH-8538.

Did I get stuck anywhere?

Not really, everything was going as expected for me.

What is coming up next?

After GH-8532, I'll work on the parallel download of the postponed wheels. My main current concern is with how the download progress will be reported to the users, but I think I'll figure it out soon.


I'm Not Drowning On My Own

McSinyx
Published: 07/06/2020

Cold Water

Hello there! My school year is coming to an end, with some final assignments and group projects left to be done. I for sure underestimated the workload of these, and in the last (and probably next) few days I've been drowning in work trying to meet my deadlines.

One project that might be remotely relevant is cheese-shop, which tries to manage the metadata of packages from the real Cheese Shop. Other than that, schoolwork is draining a lot of my time and I can't remember the last time I came up with something new for my GSoC Project )-;

Warm Water

On the bright side, I received a lot of help and encouragement from contributors and stakeholders of pip. In the last week alone, I had five pull requests merged:

  • GH-8332: Add license requirement to _vendor/README.rst
  • GH-8320: Add utilities for parallelization
  • GH-8504: Parallelize pip list --outdated and --uptodate
  • GH-8411: Refactor operations.prepare.prepare_linked_requirement
  • GH-8467: Add utility to lazily acquire wheel metadata over HTTP
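
As a rough picture of what a parallelization utility with a fallback can look like, here is a sketch of a threaded map that degrades to the built-in map when a thread pool is unavailable. The name map_threaded is hypothetical, and treating an ImportError as the signal that pools cannot be used is an assumption made for this example, not a description of the merged code.

```python
try:
    # Assumption: on platforms where pools cannot work, one of these
    # imports fails, and we fall back to sequential execution.
    import multiprocessing.synchronize  # noqa: F401
    from multiprocessing.dummy import Pool as ThreadPool
except ImportError:
    ThreadPool = None


def map_threaded(func, iterable, workers=8):
    """Apply func to each item using threads, keeping the input order."""
    if ThreadPool is None:
        return list(map(func, iterable))
    with ThreadPool(workers) as pool:
        return pool.map(func, iterable)
```

This shape suits network-bound work such as looking up the latest version of each installed package, where threads spend most of their time waiting on responses.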

In addition to helping me get my PRs merged, my mentor Pradyun Gedam also gave me my first official feedback, including what I'm doing right (and wrong too!) and what I should keep doing to increase the chance of the project being successful.

GH-7819's roadmap (Danny McClanahan's discoveries and works on lazy wheels) is being closely tracked by hatch's maintainer Ofek Lev, which really makes me proud and warms my heart, knowing that what I'm helping build is actually needed by the community!

Learning How To Swim

With GH-8467 and GH-8530 merged, I'm now working on GH-8532 which aims to roll out the lazy wheel as the way to obtain dependency information via the CLI flag --use-feature=lazy-wheel.

GH-8532 was failing initially, even though it is relatively trivial and the commit it was based on was passing. Surprisingly, after rebasing it on top of GH-8530, it mysteriously turned green. After the first (early) review, I was able to iterate on my earlier code, which used the ambiguous exception RuntimeError.
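
To show why a dedicated exception is easier to work with than a bare RuntimeError, here is a small sketch. The exception name, the helper functions and the fallback to a full download are all assumptions made for illustration, not a description of what the PR does.

```python
class RangeRequestUnsupported(Exception):
    """Raised when a server cannot serve partial content (hypothetical)."""


def lazy_metadata(url, supports_range):
    """Pretend to fetch metadata lazily; fail if ranges are unsupported."""
    if not supports_range:
        raise RangeRequestUnsupported(url)
    return f"metadata of {url} (fetched lazily)"


def metadata(url, supports_range):
    try:
        return lazy_metadata(url, supports_range)
    except RangeRequestUnsupported:
        # Only the expected failure is handled here. An unrelated bug
        # raising RuntimeError would still propagate instead of being
        # silently masked by the fallback.
        return f"metadata of {url} (after full download)"
```

Catching RuntimeError in the same spot would also swallow errors that have nothing to do with range requests, which is what makes it ambiguous.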

All that remains is to add some functional tests (I'm pretty sure this will be either overwhelming or underwhelming) to make sure that the command-line flag is working correctly. Hopefully this can make it into the beta of the upcoming release this month.

In other news, I've also submitted a patch improving the tests for the parallelization utilities, which were really messy when I first wrote them. Better late than never!

Metaphors aside, I actually can't swim d-:

Dive Plan

After GH-8532, I think I'll try to parallelize downloads of wheels that are lazily fetched only for metadata. With the current implementation of the new resolver, for pip install, this can be injected directly between the resolution and the build/installation process.
