BreadGenie's Blog

GSoC - Week 1

BreadGenie
Published: 06/15/2021

Hello everyone!

What did you do this week?

I have almost completed the package list parser for Python packages and made a checker for dpkg. I have also tried to improve checker support for Python packages, which will hopefully avoid false positives.

Some insights on why package list parsers are made when we already have checkers

  • Package list parsers are much faster than the usual scanning with checkers.

    A package list parser takes a requirements.txt file, where the package names are listed, as input. It then uses the pip freeze command to collect the installed Python packages and find their product names and versions, and compares that output with requirements.txt to filter out the needed packages. Next, the vendor values are fetched from a CSV file containing vendor, product values. Finally, the vendor, product, and version values are used to query the CVE database.
    Checkers, on the other hand, have to scan the files in a package one by one to find the necessary version strings and product name, which consumes a lot of time.

    Benchmarks (scanning all my user-installed Python packages):

    The package list parser was ~7.3x faster than using checkers. :D
  • Package list parsers can detect more products than checkers.

    Since a package list parser doesn't depend on checkers, it can detect more vendor-product pairs than scanning with checkers (but it can hit some false positives too, e.g. commonmark and zstandard).
    In my scan, 4 unique products were detected using checkers and 9 unique products using the package list parser (counting only products with vendor-product pairs in the CVE database).
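The parser steps described above can be sketched roughly as follows. This is only a minimal illustration, not the actual CVE Binary Tool code: every function name here is hypothetical, the vendor CSV is assumed to hold plain `vendor,product` rows, and the vendor names in the usage note are just example values. The final CVE database query is left out.

```python
import csv
import io
import re
import subprocess
import sys


def read_requirements(text):
    """Return the lower-cased package names listed in requirements.txt text."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like "-r x.txt"
            continue
        # The name ends at the first version specifier, extra, or marker.
        names.add(re.split(r"[=<>!~\[; ]", line, maxsplit=1)[0].lower())
    return names


def pip_freeze():
    """Return the output of `pip freeze` for the current interpreter."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def installed_versions(freeze_text):
    """Map package name -> version from `name==version` lines."""
    versions = {}
    for line in freeze_text.splitlines():
        if "==" in line:
            name, version = line.strip().split("==", 1)
            versions[name.lower()] = version
    return versions


def load_vendor_map(csv_text):
    """Map product -> vendor from CSV text with `vendor,product` rows (assumed layout)."""
    vendors = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) == 2:
            vendor, product = row
            vendors[product] = vendor
    return vendors


def build_triples(requirements_text, freeze_text, vendor_csv_text):
    """Return (vendor, product, version) triples for listed, installed packages."""
    wanted = read_requirements(requirements_text)
    versions = installed_versions(freeze_text)
    vendors = load_vendor_map(vendor_csv_text)
    triples = []
    for product in sorted(wanted):
        # Keep only packages that are installed and have a known vendor.
        if product in versions and product in vendors:
            triples.append((vendors[product], product, versions[product]))
    return triples
```

For example, `build_triples(open("requirements.txt").read(), pip_freeze(), "python,requests\n")` would return the triples that could then be looked up in the CVE database. No files are opened and scanned here, which is where the speedup over checkers comes from.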

What is coming up next?

Refactoring the parser code for better runtime, writing the docs for the parser, and adding checkers.

Did you get stuck anywhere?

No.

GSoC - Week 0

BreadGenie
Published: 06/09/2021

Hello Everyone!
I'm Muhammed Suhail, a pre-final year student at GEC Palakkad.

This summer I'll be working on CVE Binary Tool, adding a tool that reads a package list (like requirements.txt) and scans for CVEs in the listed packages, which should greatly reduce scanning time compared to binary scans.

What did you do this week?

I worked on implementing the parser for PyPI package lists. For now it takes a requirements.txt file as input via the -L or --package-list flag and extracts the values the CVE Binary Tool needs to check for CVEs under the hood.
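As a rough idea of how such a flag could be wired up, here is a small standalone sketch using argparse. It is not the CVE Binary Tool's real command-line code: only the -L / --package-list flag names come from this post, while the program name, the function names, and the simple name extraction are assumptions made for illustration.

```python
import argparse


def build_arg_parser():
    """Build a demo argument parser exposing the -L / --package-list flag."""
    parser = argparse.ArgumentParser(prog="package-list-demo")  # hypothetical name
    parser.add_argument(
        "-L",
        "--package-list",
        metavar="FILE",
        help="path to a package list such as requirements.txt",
    )
    return parser


def package_names(lines):
    """Yield the bare package name from each non-empty, non-comment line."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        # Cut the line at the first version specifier, if there is one.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<"):
            line = line.split(sep, 1)[0]
        yield line.strip().lower()


def main(argv=None):
    """Print the package names found in the file passed with -L."""
    args = build_arg_parser().parse_args(argv)
    if args.package_list:
        with open(args.package_list) as handle:
            for name in package_names(handle):
                print(name)
```

Calling `main(["-L", "requirements.txt"])` prints one package name per line. A real parser would also have to deal with extras, environment markers, and nested requirement files.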

What will you be doing for the rest of the week?

I will be further improving the parser for PyPI packages and will be adding the checkers for some of those packages to the CVE Binary Tool.

Did you get stuck anywhere?

Yup, I had a bit of difficulty understanding pytest parametrization.
Also, one of the tests I wrote is unstable for now, so I will rewrite it this week once I have brainstormed how to make it stable.

Looking forward to a fruitful summer with mentors and fellow contributors :D