Akshay_Sharma's Blog

GSoC Weekly Check-In #2

Akshay_Sharma
Published: 06/22/2021

Hey Everyone!

Last week was a bit tiring as well as exciting. I made quite a bit of progress on my Python library for MIME sniffing and learned a lot about clean-coding conventions.

What did you do this week?

I mainly focused on implementing section 6 of the MIME Sniffing standard in the library. This section covers the MIME type pattern matching algorithm, which determines the type of a file by comparing its initial bytes against predefined patterns. The standard defines patterns for several groups of files: images, audio and video, text, and archives. Some audio and video formats need their own matching rules, e.g. the signatures of MP4, WebM, and MP3 files. I also added unit tests for these algorithms that try to cover every possible case. One of my mentors added continuous integration to the GitHub repository, which will help us keep an eye on whether the library still works and make issues much easier to debug.
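
Roughly, the pattern matching step described above looks like the sketch below. The function name and parameters are made up for illustration and are not the library's actual API. The second length check after skipping ignored bytes is a defensive addition so the loop cannot read past the end of the input.

```python
from typing import FrozenSet


def matches_pattern(
    data: bytes,
    pattern: bytes,
    mask: bytes,
    ignored: FrozenSet[int] = frozenset(),
) -> bool:
    """Return True if the start of `data` matches `pattern` under `mask`.

    Hypothetical helper, written only to illustrate the idea.
    """
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have the same length")
    if len(data) < len(pattern):
        return False

    # Skip leading bytes that this pattern allows us to ignore (e.g. whitespace).
    start = 0
    while start < len(data) and data[start] in ignored:
        start += 1

    # Defensive check (an assumption of this sketch): enough bytes must remain.
    if len(data) - start < len(pattern):
        return False

    # Compare byte by byte, applying the mask to the input first.
    for offset in range(len(pattern)):
        if data[start + offset] & mask[offset] != pattern[offset]:
            return False
    return True


PNG_PATTERN = b"\x89PNG\r\n\x1a\n"
PNG_MASK = b"\xff" * len(PNG_PATTERN)
WHITESPACE = frozenset(b"\t\n\x0c\r ")

is_png = matches_pattern(PNG_PATTERN + b"more bytes", PNG_PATTERN, PNG_MASK)
# A 0xDF mask clears the ASCII case bit, so "<html" matches the "<HTML" pattern.
is_html = matches_pattern(b"  <html>", b"<HTML", b"\xff\xdf\xdf\xdf\xdf", WHITESPACE)
```

The mask is what makes one pattern cover several byte sequences: a mask byte of 0xFF demands an exact match, while 0xDF makes an ASCII letter match in either case.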

What is coming up next?

Coming up next is the main algorithm of the library: "computing the final MIME type". The rules for this algorithm are given in section 7 of the standard, and I will try to implement it fully, including tests for all the cases I can.

Did you get stuck anywhere?

Yes, the algorithm for matching the signature of WebM files given in the standard was a bit ambiguous. I tried several changes to the algorithm, some of them suggested by my mentors, and finally it worked, but I am not 100% sure whether the change I made is correct. I have left it as it is for now since it works fine, and if something goes wrong in the future I will try to fix it.

GSoC Blog Post #1

Akshay_Sharma
Published: 06/15/2021

Hey All,

It has already been a week since the GSoC coding period began, and I have started working on my project.

What did you do this week?

As I mentioned in an earlier post, this week I designed a high-level API for the Python library and started implementing the rules from the MIME Sniffing standard. I worked on section 5 of the standard, i.e. "Handling the resource metadata and headers". One of my mentors suggested creating a template for the project before moving on to further coding. So I set up a template for the library with a setup.py file, added a BSD license file, and configured tox environments for various checks: flake8, typing, py, and black.

What is coming up next?

I will start implementing section 6, i.e. "Matching a MIME type pattern", and will try to add some tests. Currently I am using a simple hard-coded test for the library, but this week I will try to automate the tests using Python unit tests and add more tests as I build the library.
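
To give an idea of what automated tests with Python's unittest module could look like, here is a minimal sketch. The helper `looks_like_gif` and the test class are hypothetical and only stand in for the real library code; the two GIF signatures are the well-known "GIF87a" and "GIF89a" headers.

```python
import unittest

# The two well-known GIF file signatures.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def looks_like_gif(data: bytes) -> bool:
    """Hypothetical stand-in for a real signature check in the library."""
    return data.startswith(GIF_SIGNATURES)


class LooksLikeGifTests(unittest.TestCase):
    def test_known_signatures(self):
        # subTest reports each signature separately if one of them fails.
        for signature in GIF_SIGNATURES:
            with self.subTest(signature=signature):
                self.assertTrue(looks_like_gif(signature + b"\x00" * 4))

    def test_truncated_input(self):
        self.assertFalse(looks_like_gif(b"GIF8"))

    def test_other_format(self):
        self.assertFalse(looks_like_gif(b"\x89PNG\r\n\x1a\n"))
```

Saved in a test module, this can be run with `python -m unittest`, which replaces a hard-coded check with tests that run the same way every time.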

Did you get stuck anywhere?

No, last week went quite smoothly, as I have done similar work before and the mentors were always there with good suggestions.

Weekly Check-In #1

Akshay_Sharma
Published: 06/07/2021

Hello everyone!!

I am Akshay Sharma, a final-year undergrad at Jaypee Institute Of Information Technology, India and a senior certificate student at University Of Florida, USA, majoring in Computer Science. With immense pleasure, I would like to mention that this summer I will be contributing to the Scrapy community under the Python Software Foundation as a GSoC student. I will be designing a Python library for MIME (Multipurpose Internet Mail Extensions) sniffing.

What did you do this week (community bonding period)?

The bonding period went well. I got to interact with my highly experienced mentors (Eugenio Lacuesta, Adrian Chaves) through video conferencing. I have known them for a year through GitHub, but meeting them face-to-face for the first time was great, and I found them friendly and helpful. We discussed the overall timeline, the implementation details, and the other requirements for my project. Apart from that, I spent most of the time reading documentation and understanding the codebases of other MIME sniffing libraries such as mimetypes and python-magic. This helped me get the gist of what I will be working on throughout the GSoC program. I also settled on a name for my library: xtractmime.

What is coming up next?

The coding period has started, and this week I will be designing the high-level API for my MIME sniffing library. The API will take an HTTP response as input and return a MIME type as output.
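
Just to illustrate the input/output shape, here is a deliberately simplified sketch. Everything in it is an assumption: the name `guess_mime_type`, its parameters, and the logic are hypothetical, and it does not follow the real rules of the MIME Sniffing standard (which do not simply trust the declared type).

```python
from typing import Optional


def guess_mime_type(body: bytes, content_type: Optional[str] = None) -> str:
    """Hypothetical entry point: response data in, MIME type out."""
    # Simplification: use the declared Content-Type when there is one,
    # dropping parameters such as "; charset=utf-8".
    if content_type:
        return content_type.split(";", 1)[0].strip().lower()
    # Otherwise look at the first bytes of the body (two example signatures).
    if body.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if body.startswith(b"%PDF-"):
        return "application/pdf"
    # Fallback for unknown binary data.
    return "application/octet-stream"


declared = guess_mime_type(b"<p>hi</p>", "Text/HTML; charset=utf-8")
sniffed = guess_mime_type(b"%PDF-1.7 ...")
```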

Did you get stuck anywhere?

I got a little stuck deciding on a starting point for the library since, unlike in other projects, I will be creating a library from scratch. After a discussion with my mentors, we finalised the necessary inputs for the API so that there will be less trouble making changes later. This also helped me get started with the coding.