During the community bonding period, I worked on the first step of my proposal. I used shlex to split the shell script into tokens and then looked for the separators (&&|;) to join the tokens back into commands. After a review with my mentor, we found that the code could be improved: we do not need to split the script into tokens first. Instead, we can search for the separators (&&|;) directly and split the commands there. This saves a lot of time, since we no longer go through every word in the shell script.
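To make the idea of splitting directly on separators concrete, here is a minimal sketch. It is not the project's actual code: the function name split_commands_sketch is hypothetical, and it assumes the separators are "&&" and ";" and that they never appear inside quoted strings.

```python
import re

# Assumption: the separators are "&&" and ";", and they never occur
# inside quoted strings (this sketch does not handle quoting at all).
SEPARATOR = re.compile(r"&&|;")


def split_commands_sketch(script):
    """Split a shell script string into commands at each separator.

    Hypothetical helper for illustration only.
    """
    parts = SEPARATOR.split(script)
    # Drop surrounding whitespace and any empty pieces left by the split.
    return [part.strip() for part in parts if part.strip()]


commands = split_commands_sketch(
    "apt-get update && apt-get install -y git; rm -rf /var/lib/apt/lists/*"
)
for command in commands:
    print(command)
```

The script is scanned once for the separators, without first building a token list for every word.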
The plan is to use the split_command function and add functions to split out branches (if and case) and loops (for and while). We will leave branches unparsed, since we cannot know which branch will be executed. For loops, we will further develop a function to extract information from them. Here are the 2 small steps I am going to take.
1. Split commands (using the separators (&&|;)), and split branches and loops (by finding keywords).
2. Extract loops. This is a simple version: we will just extract the commands inside the loop without taking the looping itself into account.
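The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names classify_statement and extract_loop_commands are hypothetical, each statement is assumed to be already isolated as one string, loops are assumed not to be nested, and keywords are matched as plain words without regard to quoting.

```python
import re

# Keywords taken from the plan above: branches are left unparsed,
# loops have their commands extracted.
BRANCH_KEYWORDS = ("if", "case")
LOOP_KEYWORDS = ("for", "while")

# Assumption: same separators as for ordinary commands.
SEPARATOR = re.compile(r"&&|;")
# Assumption: a single, non-nested loop whose body sits between "do" and "done".
LOOP_BODY = re.compile(r"\bdo\b(.*)\bdone\b", re.DOTALL)


def classify_statement(statement):
    """Return "branch", "loop", or "command" based on the first word."""
    words = statement.split(None, 1)
    first = words[0] if words else ""
    if first in BRANCH_KEYWORDS:
        return "branch"
    if first in LOOP_KEYWORDS:
        return "loop"
    return "command"


def extract_loop_commands(loop):
    """Return the commands in the loop body, ignoring the iteration itself."""
    match = LOOP_BODY.search(loop)
    if match is None:
        return []
    body = match.group(1)
    return [part.strip() for part in SEPARATOR.split(body) if part.strip()]


loop = "for pkg in git curl; do apt-get install -y $pkg; echo installed $pkg; done"
if classify_statement(loop) == "loop":
    for command in extract_loop_commands(loop):
        print(command)
```

Variables such as $pkg are left unexpanded here, which matches the simple version described in step 2: the commands are extracted, but the loop is not evaluated.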
1. Command parsing has already been implemented, so this part is the key point. Once the 2 small steps above are finished, we should be able to parse the shell script in a Dockerfile RUN command to find out what software may be installed.
2. Try to break a big issue into small independent ones.