Part of the journey is the end. It is time for me to work on my final work report for the final evaluation of Google Summer of Code 2021. This week, I will devote most of my time to writing that report.
Final Work Submission Report
- Name: Gavish Poddar
- Organisation: Python Software Foundation
- Sub-Organisation: Zyte
dateparser: Better language detection & reimplementing
- Proposal: dateparser - better language detection
Hello Everyone! My name is Gavish Poddar and I'm excited to tell you about my GSoC journey. For the past couple of months, I have been working on an awesome project, dateparser. dateparser aims to parse a datetime from a string.
What I Have Learned?
The whole GSoC journey was full of learning, thanks to my mentors. I learned how to find good open source dependencies to include in our project. I tried my hand at improving code coverage and writing tests for the code. I learned how to optimize code, and why extensive research is needed before adding a feature.
What I Have Contributed?
As mentioned in my proposal, I worked on implementing Optional Language Detection for dateparser and on fixing as many issues as possible in the search_dates function.
Optional Language Detection
Implemented optional language detection to improve how dateparser detects languages. This makes it possible to plug any language detection library into dateparser. Out of the box, dateparser supports two libraries, including langdetect. The optional language detection works with both parse and search_dates. This PR also introduces a new setting, DEFAULT_LANGUAGES, which is used when neither the default language detection nor the optional language detection detects a language.
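To make the fallback described above concrete, here is a minimal sketch of a pluggable detector with default languages. It is not dateparser's actual implementation: the names resolve_languages and keyword_detector, the detector signature, and the confidence threshold value are all assumptions made for illustration, and the toy detector stands in for a real detection library.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical detector signature: takes the text and a confidence threshold,
# returns a list of language codes (empty when nothing is detected).
Detector = Callable[[str, float], List[str]]


def keyword_detector(text: str, confidence_threshold: float) -> List[str]:
    """Toy stand-in for a real detection library; it ignores the threshold."""
    keywords = {"en": {"tomorrow", "yesterday"}, "es": {"mañana", "ayer"}}
    words = set(text.lower().split())
    return [code for code, vocab in keywords.items() if words & vocab]


def resolve_languages(
    text: str,
    detect_languages_function: Optional[Detector] = None,
    default_languages: Sequence[str] = (),
    confidence_threshold: float = 0.5,
) -> List[str]:
    """Return detected languages, or the configured defaults if none are found."""
    if detect_languages_function is not None:
        detected = detect_languages_function(text, confidence_threshold)
        if detected:
            return list(detected)
    # Nothing was detected (or no detector was plugged in): use the defaults.
    return list(default_languages)


print(resolve_languages("ayer por la tarde", keyword_detector, ["en"]))
print(resolve_languages("12/12/12", keyword_detector, ["en"]))
```

A purely numeric string such as 12/12/12 gives the detector nothing to work with, which is the kind of case where default languages are useful.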
search_dates (extended goal)
A reimplemented and simplified search_dates improves the results and fixes many issues. The whole of search_dates has been rewritten and should be easier to maintain. This PR introduces a new feature, search_first_date, which returns the first date in the given string. This PR also fixes around 13 issues.
Adding support for date-related objects such as next decade in search_date. This PR fixes 1 issue.
search_date period separator support. Date strings like 23.12.2000 can now be parsed. This PR fixes 5 issues.
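As a rough illustration of what period-separated dates look like when searched for in text, here is a small standard-library sketch. It is not how dateparser handles them: the function name find_period_dates, the regular expression, and the day-first ordering are assumptions made for this example.

```python
import re
from datetime import datetime
from typing import List, Tuple

# Matches candidates such as 23.12.2000 (day.month.year is assumed here).
PERIOD_DATE = re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b")


def find_period_dates(text: str) -> List[Tuple[str, datetime]]:
    """Return (matched substring, parsed datetime) pairs found in text."""
    results = []
    for match in PERIOD_DATE.finditer(text):
        candidate = match.group(0)
        try:
            parsed = datetime.strptime(candidate, "%d.%m.%Y")
        except ValueError:
            # Looks like a date but is not a valid one (e.g. 99.99.2000).
            continue
        results.append((candidate, parsed))
    return results


print(find_period_dates("Released on 23.12.2000 in Berlin."))
```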
Other Important Details
As part of our GSoC project, the Python Software Foundation requires us to post a weekly blog about what we have done during the week and what is coming up next. We can also write about any blockers or issues we are facing. I have written these weekly blogs too, so if you want the week-by-week details of my project, you can refer to them here.
Future Work and Final Note
The project is very actively maintained, and the new search_dates and my other contributions should improve the library. The main goal of the proposal has been achieved with the implementation of the optional language detection, and the PR is mergeable. I plan to keep working on the project and contribute as much as I can. Contributing to the search_dates function of the library (dateparser) will be my primary goal.
It was overall a wonderful experience and I learned a lot.
Thank you for reading!