GSoC Blog Post #3
Akshay_Sharma
Published: 07/13/2021
GSoC Weekly Check-In #3
Akshay_Sharma
Published: 07/07/2021
Hello Everyone!
The first phase of this year's GSoC program is coming to an end, with the first evaluation next week, and I am doing my best to finalize the implementation of the MIME sniffing library, including thorough testing.
What did you do this week?
I implemented section 7, "Determining the computed MIME type of a resource". This section covers the library's main sniffing functions, including sniffing rules such as "Identifying a resource with an unknown MIME type", "Sniffing a mislabeled binary resource", and "Sniffing a mislabeled RSS XML feed".
What is coming up next?
This week I will write tests for section 7, aiming to cover every case and reach 100% coverage. I will also start integrating the library as soon as it is finalized.
Did you get stuck anywhere?
Section 7.3, i.e. "Sniffing a mislabeled RSS XML feed", was a bit confusing and complicated because of the way the standard presents its pseudocode, but my mentors were there to help me with it. Other than this, there were no major problems last week.
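For illustration, the "Sniffing a mislabeled binary resource" rule from section 7 can be sketched roughly as follows. This is a simplified sketch based on a reading of the standard, not necessarily the library's actual code; the names `BINARY_DATA_BYTES` and `sniff_text_or_binary` are hypothetical.

```python
# Bytes the standard treats as "binary data bytes" (assumed ranges:
# 0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F).
BINARY_DATA_BYTES = frozenset(
    list(range(0x00, 0x09))
    + [0x0B]
    + list(range(0x0E, 0x1B))
    + list(range(0x1C, 0x20))
)


def sniff_text_or_binary(header: bytes) -> str:
    """Hypothetical helper: decide between text/plain and a generic binary type."""
    # A UTF-16 (BE or LE) byte order mark indicates text.
    if header.startswith((b"\xfe\xff", b"\xff\xfe")):
        return "text/plain"
    # A UTF-8 byte order mark indicates text as well.
    if header.startswith(b"\xef\xbb\xbf"):
        return "text/plain"
    # No binary data bytes anywhere in the header: treat it as text.
    if not any(byte in BINARY_DATA_BYTES for byte in header):
        return "text/plain"
    return "application/octet-stream"
```

For example, `sniff_text_or_binary(b"hello\n")` gives `"text/plain"`, while a header containing a NUL byte and no byte order mark gives `"application/octet-stream"`.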
GSoC Blog Post #2
Akshay_Sharma
Published: 06/29/2021
Hi All,
The fourth week of this year's GSoC program is complete, and I have implemented most of the major parts of the
MIME standards in my Python library, including sections 4, 5, and 6 and parts of section 7.
What did you do this week?
I spent the last week fixing some major issues in the library. The issue that took most of my time was the algorithm for matching the MIME type pattern of an MP3 file without ID3 tags. ID3 tags hold metadata such as artist name, album name, genre, and more. The algorithm given in the standards has various problems, which are described in the issue
here. I was finally able to fix the algorithm by taking reference from the
implementation of Mozilla for MP3 files. I also worked on my coding style, thanks to my mentor Adrian Chaves for his extremely helpful reviews and suggestions, and I learned a lot of interesting things along the way.
What is coming up next?
I already started on section 7 last week, but there is still a lot to cover, including the tests, which I will try to complete this week.
Did you get stuck anywhere?
Apart from fixing the algorithm for matching the MIME pattern of MP3 files without ID3 tags, last week was interesting and went smoothly.
GSoC Weekly Check-In #2
Akshay_Sharma
Published: 06/22/2021
Hey Everyone!
Last week was a bit tiring as well as exciting. I made quite a bit of progress on my Python library for MIME sniffing and learned a lot of new things about clean-coding conventions.
What did you do this week?
I focused mainly on implementing section 6 of the
MIME standards in the library. This section covers the MIME matching algorithm, which determines the type of a file by matching its initial bytes against predefined patterns. The standards define numerous predefined patterns for image, audio or video, text, and archive files. Some audio and video formats require their own matching rules, e.g. matching the signatures of MP4, WebM, and MP3 files. I also added unit tests for these algorithms, aiming to cover every possible test case. One of my mentors added continuous integration to the GitHub repository, which helps keep an eye on the state of the library and makes any issues much easier to debug.
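The byte-pattern matching described above can be sketched as follows. This is a minimal sketch based on a reading of the standard's pattern matching algorithm, not necessarily the library's actual code; `matches_pattern`, `WHITESPACE_BYTES`, and the example patterns are names chosen for illustration. The extra bounds check after skipping ignored bytes is an assumption added here to avoid reading past the end of the input.

```python
# Bytes that may be skipped before some patterns (tab, LF, FF, CR, space).
WHITESPACE_BYTES = b"\t\n\x0c\r "


def matches_pattern(
    data: bytes, pattern: bytes, mask: bytes, ignored: bytes = b""
) -> bool:
    """Hypothetical helper: True if data matches pattern under mask."""
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have the same length")
    if len(data) < len(pattern):
        return False
    # Skip leading bytes that the pattern allows us to ignore.
    start = 0
    while start < len(data) and data[start] in ignored:
        start += 1
    # Assumption: not enough bytes left after skipping means no match.
    if len(data) - start < len(pattern):
        return False
    # Compare each masked input byte with the corresponding pattern byte.
    return all(
        (data[start + i] & mask[i]) == pattern[i] for i in range(len(pattern))
    )


# PNG signature: every bit of every byte must match.
PNG_PATTERN = b"\x89PNG\r\n\x1a\n"
PNG_MASK = b"\xff" * len(PNG_PATTERN)

# Simplified case-insensitive "<HTML" pattern: the 0xDF mask clears the
# bit that distinguishes lowercase from uppercase ASCII letters.
HTML_PATTERN = b"<HTML"
HTML_MASK = b"\xff\xdf\xdf\xdf\xdf"
```

With these definitions, `matches_pattern(b"\x89PNG\r\n\x1a\n...", PNG_PATTERN, PNG_MASK)` is true, and `matches_pattern(b"  \n<html>", HTML_PATTERN, HTML_MASK, ignored=WHITESPACE_BYTES)` is true as well because the leading whitespace is skipped and the mask makes the letters case-insensitive.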
What is coming up next?
Coming up next is the main algorithm of the library: "computing the final MIME type". The rules for this algorithm are given in section 7 of the standards, and I will try to implement it fully, including all the tests I can think of.
Did you get stuck anywhere?
Yes, the algorithm for matching the signature of WebM files given in the standards was a bit ambiguous. I tried many possible changes to the algorithm, some of them suggested by my mentors, and it finally worked, but I am not 100% sure whether the change I made is correct. I have left it as it is for now, since it works fine, and I will fix it if something goes wrong in the future.
GSoC Blog Post #1
Akshay_Sharma
Published: 06/15/2021
Hey All,
It has already been a week since the GSoC coding period began, and I have started working on my project.
What did you do this week?
As I mentioned in an earlier post, I designed a high-level API for the Python library this week and started implementing the rules given in the
MIME sniffing standards. I worked on section 5 of the standards, i.e. "Handling the resource metadata and headers". One of my mentors suggested creating a template for the project before moving on to further coding, so I set up a template for the library with a
setup.py file, added a BSD license file, and configured
tox environments for various checks such as flake8, typing, py, and black.
What is coming up next?
I will start implementing section 6, i.e. "Matching a MIME type pattern", and will try to add some tests. Currently I am using a simple hard-coded test for the library, but this week I will try to automate the tests using Python unit tests and add more as I build the library.
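As a rough idea of what moving from a hard-coded check to automated unit tests can look like, here is a minimal sketch using Python's built-in `unittest` module. The function `starts_with_gif_signature` and the test class are hypothetical stand-ins for illustration, not the library's actual API.

```python
import unittest


def starts_with_gif_signature(data: bytes) -> bool:
    """Hypothetical helper: True if data begins with a GIF signature."""
    return data.startswith((b"GIF87a", b"GIF89a"))


class StartsWithGifSignatureTests(unittest.TestCase):
    def test_gif87a_header_matches(self):
        self.assertTrue(starts_with_gif_signature(b"GIF87a\x01\x00"))

    def test_gif89a_header_matches(self):
        self.assertTrue(starts_with_gif_signature(b"GIF89a\x01\x00"))

    def test_other_data_does_not_match(self):
        self.assertFalse(starts_with_gif_signature(b"\x89PNG\r\n\x1a\n"))
```

Saved in a test module, these can be run with `python -m unittest`, and new cases can be added as methods whenever a new pattern is implemented.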
Did you get stuck anywhere?
No, last week went quite smoothly, as I have done similar work before, and my mentors were always there with helpful suggestions.