Part of the journey is the end | Final Report

Published: 08/24/2021

Hi, community

Part of the journey is the end. It is time for me to write my final work report for the final evaluation of Google Summer of Code 2021. This week, I will devote most of my time to writing it.

Final Work Submission Report

Hello everyone! My name is Gavish Poddar, and I'm excited to tell you about my GSoC journey. For the past couple of months, I have been working on an awesome project, dateparser, which aims to parse dates and times from strings.

My GSoC journey would not have been successful without the guidance of my mentors Marc Hernández, Konstantin Lopuhin and Kishan Mehta.

What I Have Learned?

The whole GSoC journey was full of learning, thanks to my mentors. I learned how to find good open source dependencies to include in our project. I tried my hand at improving code coverage and writing tests for the code. I also learned how to optimize code, and why extensive research is needed before adding a feature.

What I Have Contributed?

As mentioned in my proposal, I worked on implementing Optional Language Detection for dateparser and on fixing as many issues as possible in its search_dates function.

Optional Language Detection

PR - Optional Language Detection

I implemented optional language detection to improve how dateparser detects languages. It allows any language detection library to be plugged into dateparser. Out of the box, dateparser supports two libraries, fasttext and langdetect. The optional language detection works with both parse and search_dates. This PR also introduces a new setting, DEFAULT_LANGUAGES, which is used if neither the default language detection nor the optional language detection detects a language.
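To give a rough idea of what plugging in a detector can look like, here is a minimal sketch. It assumes the interface that dateparser documents for custom detectors: a function taking `text` and `confidence_threshold` and returning a list of language codes, passed to `parse` through the `detect_languages_function` argument. The script-based detector and the `parse_with_detector` helper are hypothetical and only for illustration; the exact interface in the PR may differ.

```python
import re

# Hypothetical toy detector (not part of dateparser): it guesses the
# language from the share of Cyrillic letters in the text.
_CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def detect_languages(text, confidence_threshold):
    """Return a list of language codes guessed for `text`.

    Assumed shape for a custom detector:
    (text, confidence_threshold) -> list of language codes.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        # Nothing to base a guess on, e.g. "12/12/12".
        return []
    cyrillic_share = sum(1 for ch in letters if _CYRILLIC.match(ch)) / len(letters)
    if cyrillic_share >= confidence_threshold:
        return ["ru"]
    if 1 - cyrillic_share >= confidence_threshold:
        return ["en"]
    return []


def parse_with_detector(text):
    """Hypothetical helper: parse `text` using the detector above."""
    # Imported lazily so the detector can be tried without dateparser installed.
    import dateparser

    return dateparser.parse(
        text,
        detect_languages_function=detect_languages,
        # Fallback used when no language is detected.
        settings={"DEFAULT_LANGUAGES": ["en"]},
    )
```

With this sketch, a string with no letters such as 12/12/12 gives the detector nothing to work with, so it returns an empty list and DEFAULT_LANGUAGES is what the parser would fall back on.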

Reimplementing search_dates (extended goal)

PR - Reimplementing search_dates

The reimplemented and simplified search_dates improves the results and fixes many issues. The whole of search_dates is newly implemented and should be easier to maintain. This PR also introduces a new feature, search_first_date, which returns the first date in the given string, and it fixes around 13 issues.
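The idea of returning only the first date can be sketched on top of the existing search_dates. This is not the search_first_date from the PR, whose signature is not shown here. The `first_date` helper and its `search` parameter are hypothetical names, and the sketch relies on search_dates returning a list of (substring, datetime) pairs, or None when nothing is found.

```python
def first_date(text, search=None, **kwargs):
    """Return the first (substring, datetime) pair found in `text`, or None.

    `search` is a hypothetical hook for swapping in another search function;
    by default the existing dateparser.search.search_dates is used.
    """
    if search is None:
        # Imported lazily so the helper can be exercised with a stub.
        from dateparser.search import search_dates as search

    results = search(text, **kwargs)
    # search_dates returns None (not an empty list) when no date is found.
    return results[0] if results else None
```

Note that this wrapper still scans the whole string before picking the first match, whereas a dedicated implementation could stop as soon as one date is found.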

Other search_dates improvements

Added support for date-related expressions such as last decade, next decade, etc. in search_dates. This PR fixes 1 issue.

PR - Improvements in locale:translate_search fixes

Added period separator support to search_dates, so date strings like 23.12.2000 can be parsed. This PR fixes 5 issues.

PR - search_date period separator support

Other Important Details

As part of our GSoC project, the Python Software Foundation requires us to post a weekly blog, where we usually write about what we have done during the week and what is coming up next. We can also write about any blockers or issues we are facing. I have written my weekly blogs too, so if you want the week-by-week details of my project, you can refer to them here.

Weekly Blogs

Future Work and Final Note

The project is very actively maintained, and the new search_dates and my other contributions should improve the library. The main goal of the proposal has been achieved with the implementation of the optional language detection, and the PR is mergeable. I plan to keep working on the project and contribute as much as I can. Contributing to the search_dates function of the library (dateparser) will be my primary goal.

It was overall a wonderful experience and I learned a lot.

I would like to thank Google, the Python Software Foundation and Zyte for providing me with this opportunity, and my mentors Marc Hernández, Konstantin Lopuhin and Kishan Mehta for their guidance.

Thank you for reading!