Niraj-Kamdar's Blog

GSoC: Week 10: ''' Documentation '''

Niraj-Kamdar
Published: 08/03/2020

Hello guys, 

I hope you are all doing great. Today, I am going to talk about what I did this week.

What did I do this week?

I am working on the documentation for the code I produced during the first two phases. I have updated the user manual and the README, and I am going to update the other documentation as well. I have also written user manual sections for the new input engine features and the config file feature.

What am I doing this week? 

I talked with a user and we came to the conclusion that our documentation lacks some important how-to guides, which are necessary, as Daniele Procida explains in his amazing PyCon talk. So, I am going to create a How-to directory inside our doc folder which will contain recipes for different use cases. For example:

  1. How to change the theme of the HTML output?
  2. How to add a custom checker (out-of-tree checker)?
  3. How to scan a Docker image?
  4. How to run scans in parallel?

Have I got stuck anywhere?

No, I didn't get stuck anywhere this week.

 


GSoC: Week 9: ConfigParser()

Niraj-Kamdar
Published: 07/26/2020

What did I do this week?

I researched various configuration file formats and compiled the outcomes in an issue: Discussion: Configuration file format. Some users recommended INI files because the format is very old and still popular, but INI has no built-in type support and lacks a formal specification. It parses everything as a string, so we have to process the data parsed by configparser to convert it into something usable.
Our example data would be parsed as the following dictionary:

{
    "checker": {
        "runs": "[curl,binutils]",  # This has to be transformed into list 
        "skips": "[python,bzip2]"
    },
    "input": {
        "directory": "test/assets",
        "input_file": "test/csv/triage.csv"
    },
}
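As a minimal sketch of the extra processing this implies, here is how the same data could be read with the standard-library configparser. The INI text and the parse_list helper are assumptions made up for illustration; they are not part of cve-bin-tool.

```python
import configparser

# Hypothetical INI content matching the example data above.
INI_TEXT = """
[checker]
runs = [curl,binutils]
skips = [python,bzip2]

[input]
directory = test/assets
input_file = test/csv/triage.csv
"""


def parse_list(raw):
    """Turn a string such as "[curl,binutils]" into ["curl", "binutils"]."""
    inner = raw.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Every value comes back as a plain string.
raw_config = {section: dict(parser[section]) for section in parser.sections()}

# Lists have to be converted by hand (the same goes for numbers and booleans).
runs = parse_list(raw_config["checker"]["runs"])
print(raw_config["checker"]["runs"], "->", runs)
```

The conversion step is exactly the part that TOML and YAML parsers do for us.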

So, parsing an INI file won't be as easy as parsing TOML or YAML, which support complex data types by default. Other data types, such as integers and floats, also have to be converted explicitly.

TOML is very similar to INI, but it supports complex data types by default. The same data is parsed as:

{
    'checker': {
        'runs': ['curl', 'binutils'],  # this is correctly parsed as list
        'skips': ['python', 'bzip2']
    },
    'input': {
        'directory': 'test/assets',
        'input_file': 'test/csv/triage.csv'
    },
}

I concluded that TOML and YAML are both very easy to read and write for both machines and humans, so we should use one of them. We discussed which format to use in our meeting and my mentors had various opinions on it. The summary of our discussion was: "The top contenders among our team seem to be TOML (readable, familiar to python folk and close enough to INI for skill transfer for windows folk) and YAML (which might be a better fit for the dev-ops community that we hope will be among the biggest users of cve-bin-tool)."

Since the parsers for both formats produce similar Python structures, I have created a ConfigParser class which can parse both YAML and TOML files, and I have added basic tests for it. I have also changed the architecture of the main function of cli.py to add support for config files, and I made sure that options given on the terminal take precedence over config file options. I am going to add tests for this as well. I have also fixed the quiet mode bugs.
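The precedence rule can be sketched roughly like this. The merge_options helper, the option names and the default values are assumptions for illustration only; the real cli.py may be organised differently.

```python
import argparse


def merge_options(cli_args, config, defaults):
    """Return final options: CLI value first, then config value, then default."""
    merged = dict(defaults)
    merged.update({k: v for k, v in config.items() if v is not None})
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


parser = argparse.ArgumentParser()
# default=None lets us tell "not given on the terminal" apart from a real value.
parser.add_argument("--runs", default=None)
parser.add_argument("--log", default=None)

# Pretend the user typed only "--log debug" on the terminal.
cli_args = vars(parser.parse_args(["--log", "debug"]))
config = {"runs": "curl,binutils", "log": "info"}  # as if read from a config file
defaults = {"runs": "", "log": "warning", "quiet": False}

options = merge_options(cli_args, config, defaults)
print(options)
```

Here the log level comes from the terminal, the runs value from the config file, and quiet from the defaults.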

What am I doing this week? 

I am going to write tests for config files in test_cli.py, and since I have completed almost all the work related to InputEngine, I think it's a good time to document it.

Have I got stuck anywhere?

Yes, I need my quiet mode bug fix PR to be merged, since I changed TestCLI in it and I need the latest TestCLI for testing ConfigParser.

 


GSoC: Week 8: InputEngine.extend(functionalities)

Niraj-Kamdar
Published: 07/19/2020

What did I do this week?

I didn't know how other triage data, like custom severity, would be used, so I asked my mentor about it and she gave me various use-case scenarios where it can be useful. After understanding the requirements, I added support for three new fields to our input_engine: 1) comments, 2) cve_number and 3) severity. Users can now specify this triage data and it will be reflected in all the machine-readable output formats. I have also added support for the wheel and egg archive formats, and I have modernized error handling in outputengine and extractor. I have also fixed a bug which was causing the progress bar to be displayed in quiet mode.

What am I doing this week? 

I am going to work on the configuration file this week. I am most likely going to choose TOML as our config file format, as recommended by PEP.

Have I got stuck anywhere?

No, I didn't get stuck anywhere this week.


GSoC: Week 7: with ErrorHandler()

Niraj-Kamdar
Published: 07/12/2020

What did I do this week?

This week my mentor pointed out several issues in my InputEngine PR and I fixed them. I have fixed the issue "Use patterns in VERSION_PATTERNS as valid CONTAINS_PATTERNS by default"; for that, I changed the checker metaclass to include VERSION_PATTERNS as valid CONTAINS_PATTERNS by default. I also changed the mapping test data of all checkers and removed redundant CONTAINS_PATTERNS. I have also fixed the escape sequence issue. Finally, I have created an error_handler module which provides an ErrorHandler context manager. It displays a colorful traceback and sets a custom exit code. Currently, it supports four different modes for error handling:

  1. TruncTrace - displays a truncated traceback (default)
    • truncated traceback output
  2. FullTrace - displays the full traceback (when the logging level is debug, which can be set via the -l debug option)
    • full traceback output
  3. NoTrace - displays no traceback (when the logging level is critical, which can be set via the -q (--quiet) flag)
    • no traceback output
  4. Ignore - ignores any raised exception (only used internally)

I have moved all custom exceptions into the error_handler module so that it is easy to assign error codes. I have also changed excepthook to display a colorized traceback, and I have changed the unit tests for cli and input_engine to incorporate the changes in exception handling. If you raise an error without the context manager, you will get the full traceback regardless of the mode you set. So, always use the ErrorHandler context manager to raise an exception, or wrap it around code that can raise one.
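To give a rough idea of how such a context manager can work, here is a simplified sketch. It only reuses the mode names listed above; the ErrorMode enum, the constructor parameters and the exit code values are assumptions, and the colorized output is left out entirely.

```python
import sys
import traceback
from enum import Enum


class ErrorMode(Enum):
    Ignore = 0
    NoTrace = 1
    TruncTrace = 2
    FullTrace = 3


class ErrorHandler:
    """Context manager that reports an exception according to `mode`."""

    def __init__(self, mode=ErrorMode.TruncTrace, exit_code=1):
        self.mode = mode
        self.exit_code = exit_code

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            return False  # nothing was raised
        if self.mode is ErrorMode.Ignore:
            return True  # swallow the exception and carry on
        if self.mode is ErrorMode.FullTrace:
            traceback.print_exception(exc_type, exc_val, exc_tb)
        elif self.mode is ErrorMode.TruncTrace:
            # A negative limit keeps only the innermost frame(s).
            traceback.print_exception(exc_type, exc_val, exc_tb, limit=-1)
        # NoTrace prints nothing; every non-Ignore mode exits with the custom code.
        sys.exit(self.exit_code)


# Usage: in Ignore mode the error is swallowed and execution continues.
with ErrorHandler(mode=ErrorMode.Ignore):
    raise ValueError("swallowed")
reached = True
```

In the other modes the block ends the program with the given exit code, so only the amount of traceback printed differs.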

What am I doing this week? 

I am going to improve InputEngine and Extractor modules this week.

Have I got stuck anywhere?

I wanted to improve InputEngine this week, but the ideas discussed in the issue about the other functionalities of InputEngine aren't clear, so I wanted to discuss future plans for InputEngine in this week's meeting. Unfortunately, the mentors were busy this week and the meeting got canceled. However, terriko had opened issues regarding exceptions, and I got the idea to colorize the traceback and extend custom error codes to every module, so I did that instead. As you can see, it looks awesome now.

 

 


GSoC: Week 6: class InputEngine

Niraj-Kamdar
Published: 07/06/2020

What did I do this week?

I started working on the input engine this week. Currently, we only have csv2cve, which accepts a csv file of vendor, product and version as input and produces a list of CVEs as output. It is a separate module with a separate command line entry point. I have created a module called input_engine that can process data from any input format (currently csv and json). Users can now add a remarks field in the csv or json file, which can take any of the following values (the values in parentheses are aliases for that specific type):

  1. NewFound (1, n, N)
  2. Unexplored (2, u, U)
  3. Mitigated (3, m, M)
  4. Confirmed (4, c, C)
  5. Ignored (5, i, I)
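One way to model these values and their aliases is an Enum with a small lookup function. This is only a sketch: the parse_remark helper is hypothetical, and accepting the full name case-insensitively is my own assumption on top of the aliases listed above.

```python
from enum import Enum


class Remarks(Enum):
    # The numeric value doubles as the display priority.
    NewFound = 1
    Unexplored = 2
    Mitigated = 3
    Confirmed = 4
    Ignored = 5


def parse_remark(value):
    """Map a full name, a number or a one-letter alias to a Remarks member."""
    text = str(value).strip().lower()
    for remark in Remarks:
        aliases = {remark.name.lower(), str(remark.value), remark.name[0].lower()}
        if text in aliases:
            return remark
    raise ValueError(f"unknown remark: {value!r}")


print(parse_remark("N"), parse_remark(3), parse_remark("Confirmed"))
```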

I have added an --input-file (-i) option to cli.py to specify the input file. The input_engine parses it and creates an intermediate data structure that the output_engine uses to display data according to remarks. Output is displayed in the same order as the priority given to the remarks. I have also created a dummy csv2cve which just calls cli.py with the -i option, passing along the argument given to csv2cve. Here is an example of using -i with an input file to produce CVEs: cve-bin-tool -i=test.csv. Users can also use -i to supplement remarks data while scanning a directory, so that the output is sorted according to remarks. Here is an example of that: cve-bin-tool -i=test.csv /path/to/scan.
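The ordering by remarks can be pictured with a short sketch like the one below. The csv rows are invented sample data and the PRIORITY table is an assumption based on the numbering of the remarks above; the actual intermediate data structure in input_engine is not shown here.

```python
import csv
import io

# Assumed priority: lower number is displayed first.
PRIORITY = {"NewFound": 1, "Unexplored": 2, "Mitigated": 3, "Confirmed": 4, "Ignored": 5}

# Invented sample input with the vendor, product, version and remarks columns.
CSV_TEXT = """vendor,product,version,remarks
haxx,curl,7.59.0,Mitigated
gnu,binutils,2.31.1,NewFound
python,python,3.8.1,Ignored
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Sort the rows so that they come out in remark priority order.
ordered = sorted(rows, key=lambda row: PRIORITY[row["remarks"]])
products = [row["product"] for row in ordered]
print(products)
```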

I have also added test cases for input_engine and removed the old csv2cve test cases.

What am I doing this week? 

I have exams this week, from today until 9th July, so I won't be able to do much. I will spend my weekend improving input_engine, for example by giving more fine-grained control over remarks and custom severity.

Have I got stuck anywhere?

No, I didn't get stuck anywhere this week :)
