GSoC: Week 3: Awaiting the Future

Niraj-Kamdar
Published: 06/15/2020

Hello everyone,

What did I do this week?

I have started working on optimizing the concurrency of CVE Binary Tool. I am going to use asyncio for IO-bound tasks and a process pool for long CPU-bound tasks. I have converted the IO-bound synchronous functions of the extractor (PR#741), strings (PR#746) and file (PR#750) modules into asynchronous coroutines. I have also created an async_utils module which provides the asynchronous utility functions and classes that every module needs.

asyncio's event loop doesn't support file IO directly, so I searched for an external library that provides the functionality I need. I found one, aiofiles, but it lacks many features such as asynchronous tempfile and shutil, and it also has many issues and PRs that have been open for more than a year. So I decided to write one myself. After 2-3 days of research and coding, I have created an asynchronous FileIO class with all the methods that a synchronous file object provides, and I have implemented tempfile's TemporaryFile, NamedTemporaryFile and SpooledTemporaryFile classes on top of it.

Since we use subprocess in many places, I have also created an asynchronous run_command coroutine which runs a command in a non-blocking manner. Finally, I have converted the synchronous unit tests to asynchronous ones using the pytest-asyncio plugin for pytest.
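To illustrate the idea behind a non-blocking run_command, here is a minimal sketch built on asyncio's subprocess support. It is not the code from the PRs: the signature and the returned tuple are assumptions made for this example, and the real coroutine in async_utils may look different.

```python
import asyncio
import sys


async def run_command(*args):
    """Run a command without blocking the event loop.

    Hypothetical signature for illustration; returns (stdout, stderr, returncode).
    """
    process = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() waits for the child to exit while other coroutines keep running
    stdout, stderr = await process.communicate()
    return stdout, stderr, process.returncode


async def main():
    # Use the current Python interpreter as a portable stand-in for a real tool
    cmd = [sys.executable, "-c", "print('hello')"]
    # Both child processes are started and awaited concurrently
    return await asyncio.gather(run_command(*cmd), run_command(*cmd))


if __name__ == "__main__":
    for stdout, stderr, returncode in asyncio.run(main()):
        print(returncode, stdout.decode().strip())
```

Because each call only awaits the child process, several commands can be in flight at once instead of running one after another as they would with a blocking subprocess call.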

What am I doing this week? 

I am going to refactor scanner into two separate modules: 1) version_scanner and 2) cve_scanner. I am thinking about calling the second one cve_fetcher to avoid misunderstanding, but since I have mentioned cve_scanner in my proposal and issues, let's keep that name for now. I will be merging the get_cves methods of cvedb and scanner into one module called cve_scanner, which uses cvedb. This will make the code more maintainable and readable once I convert it to asynchronous code.

Have I got stuck anywhere?

I wasn't able to decide whether I should use aiofiles and implement the functions it lacks on top of it, or implement everything on my own. I was confused because I didn't want to reinvent the wheel, and the code base of aiofiles was scary at first glance. But then I figured out that the code of aiofiles is unnecessarily complicated. So I have borrowed some of its logic and written all the functionality it provides, plus the tempfile functionality that I need, in a compact form.
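The core trick that aiofiles relies on is to hand each blocking file call to a thread pool so that the event loop stays free. Here is a minimal sketch of that approach. The AsyncFile class and the roundtrip helper are hypothetical names for this example, not the FileIO class from my PRs, and only read and write are wrapped.

```python
import asyncio
import os
import tempfile
from functools import partial


class AsyncFile:
    """Hypothetical wrapper that delegates blocking file calls to a thread pool."""

    def __init__(self, path, mode="r"):
        self._path = path
        self._mode = mode
        self._file = None

    async def _call(self, func, *args, **kwargs):
        # None selects the event loop's default ThreadPoolExecutor
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, partial(func, *args, **kwargs))

    async def __aenter__(self):
        self._file = await self._call(open, self._path, self._mode)
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self._call(self._file.close)

    async def read(self, size=-1):
        return await self._call(self._file.read, size)

    async def write(self, data):
        return await self._call(self._file.write, data)


async def roundtrip(text):
    # Write text to a temporary file and read it back, all without blocking the loop
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        async with AsyncFile(path, "w") as f:
            await f.write(text)
        async with AsyncFile(path) as f:
            return await f.read()


if __name__ == "__main__":
    print(asyncio.run(roundtrip("hello")))
```

The remaining file methods can be wrapped the same way, which is why a compact implementation is possible.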

I am also thinking about making my own library as an alternative to aiofiles, one which also implements other file IO functionality like shutil and os, and publishing it on PyPI.
