GSoC: Week 3: Awaiting the Future

Published: 06/15/2020

Hello everyone,

What did I do this week?

I have started optimizing the concurrency of CVE Binary Tool. I am going to use asyncio for IO-bound tasks and a process pool for long CPU-bound tasks. I have converted the IO-bound synchronous functions of the extractor (PR#741), strings (PR#746) and file (PR#750) modules into asynchronous coroutines. I have also created an async_utils module, which provides the asynchronous utility functions and classes that every module needs.

Since asyncio's event loop doesn't support file IO directly, I searched for an external library that provides the functionality I need and found one: aiofiles. However, it lacks many features, such as asynchronous tempfile and shutil support, and it has many issues and PRs that have been open for more than a year. So, I decided to write one myself. After 2-3 days of research and coding, I have created an asynchronous FileIO class with all the methods that a synchronous file object provides, and implemented tempfile's TemporaryFile, NamedTemporaryFile and SpooledTemporaryFile classes on top of it. I have also created an asynchronous run_command coroutine, which runs a command in a non-blocking manner, since we use subprocess in many places. Finally, I have converted the synchronous unit tests to asynchronous ones using the pytest-asyncio plugin for pytest.
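To give a rough idea of what a non-blocking run_command can look like, here is a minimal sketch built on asyncio.create_subprocess_exec. It is not the code from the PRs: the signature and the returned tuple are assumptions made for illustration, and the main helper is hypothetical.

```python
import asyncio
import sys


async def run_command(*args):
    """Illustrative sketch: run a command without blocking the event loop.

    The signature and return value are assumptions, not the real
    async_utils API.
    """
    process = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() waits for the process to exit and collects its output,
    # while other coroutines keep running in the meantime.
    stdout, stderr = await process.communicate()
    return process.returncode, stdout, stderr


async def main():
    # Both commands are started together and awaited concurrently.
    return await asyncio.gather(
        run_command(sys.executable, "-c", "print('hello')"),
        run_command(sys.executable, "-c", "print('world')"),
    )


if __name__ == "__main__":
    for returncode, stdout, _ in asyncio.run(main()):
        print(returncode, stdout.decode().strip())
```

Because the coroutine only awaits the subprocess, the event loop stays free for other IO-bound work while the external command runs.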

What am I doing this week? 

I am going to refactor scanner into two separate modules: 1) version_scanner and 2) cve_scanner. (I am thinking about calling the second one cve_fetcher to avoid misunderstanding, but since I have already mentioned cve_scanner in my proposal and issues, let's keep that name for now.) I will merge the get_cves methods of cvedb and scanner into a single module called cve_scanner, which uses cvedb. This will make the code more maintainable and readable once I convert it to asynchronous code.

Have I got stuck anywhere?

I couldn't decide whether I should use aiofiles and implement the functions it lacks on top of it, or write an implementation of my own. I was confused because I don't want to reinvent the wheel, and the code base of aiofiles looked scary at first glance. But then I figured out that the code of aiofiles is unnecessarily complicated. So, I have borrowed some of its logic and written all the functionality it provides, plus the tempfile functionality that I need, in a compact form.
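The general idea behind this kind of wrapper is to hand each blocking file call to a thread pool executor so the event loop is never blocked. Below is a minimal sketch of that idea, not the actual async_utils code: the AsyncFile class, the async_temporary_file helper and demo are hypothetical names, and only a few file methods are covered.

```python
import asyncio
import functools
import tempfile


class AsyncFile:
    """Hypothetical wrapper that runs blocking file methods in a thread pool."""

    def __init__(self, file):
        self._file = file

    async def _run(self, func, *args):
        # run_in_executor(None, ...) uses the loop's default thread pool.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, functools.partial(func, *args))

    async def read(self, size=-1):
        return await self._run(self._file.read, size)

    async def write(self, data):
        return await self._run(self._file.write, data)

    async def seek(self, offset, whence=0):
        return await self._run(self._file.seek, offset, whence)

    async def close(self):
        return await self._run(self._file.close)

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.close()


async def async_temporary_file(mode="w+b"):
    # Creating the temporary file touches the disk, so it is delegated too.
    loop = asyncio.get_running_loop()
    raw = await loop.run_in_executor(
        None, functools.partial(tempfile.TemporaryFile, mode=mode)
    )
    return AsyncFile(raw)


async def demo():
    async with await async_temporary_file() as f:
        await f.write(b"cve-bin-tool")
        await f.seek(0)
        return await f.read()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A full version would wrap the remaining file methods the same way, and NamedTemporaryFile and SpooledTemporaryFile could follow the same pattern as async_temporary_file.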

I am also thinking about turning this into my own library as an alternative to aiofiles, one that also implements other file IO functionality such as shutil and os, and publishing it on PyPI.