Weekly Blog #3 (29th Jun - 5th Jul)

arnav_k
Published: 07/06/2020

Hey everyone, we are done with the first third of the program, and I will use this blog to give the weekly update and to summarize the current state of progress. In the past 4 weeks, we have created a new number-parser library from scratch and built an MVP that is being continuously improved.

Last week was spent fine-tuning the parser to retrieve the relevant data from the CLDR RBNF repo. RBNF stands for Rule-Based Number Format; the repo is basically a Java library that converts a number (23) to the corresponding word (twenty-three). It has a lot of hard-coded values and data that are very useful to our library, so we plan to extract all this information accurately and efficiently.
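To give a rough idea of what this extraction looks like, here is a minimal sketch. The XML fragment is hand-written in the style of the RBNF files and heavily simplified, and `extract_words` is a hypothetical helper name, not the actual code in our library. It only handles plain integer rules and ignores the more complex rule syntax.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment in the style of an RBNF file.
# This is an assumption for illustration, not copied from the repo.
SAMPLE = """
<rulesetGrouping type="SpelloutRules">
  <ruleset type="spellout-numbering">
    <rbnfrule value="0">zero;</rbnfrule>
    <rbnfrule value="1">one;</rbnfrule>
    <rbnfrule value="20">twenty[-→→];</rbnfrule>
    <rbnfrule value="100">←← hundred[ →→];</rbnfrule>
  </ruleset>
</rulesetGrouping>
"""


def extract_words(xml_text):
    """Map each spelled-out word to its integer value (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    words = {}
    for rule in root.iter("rbnfrule"):
        value = rule.get("value", "")
        if not value.isdigit():
            continue  # skip non-integer rule values
        text = (rule.text or "").strip().rstrip(";")
        text = re.sub(r"\[.*?\]", "", text)  # drop the optional [..] part
        text = re.sub(r"[←→]+", "", text)    # drop the substitution arrows
        word = text.strip()
        if word:
            words[word] = int(value)
    return words


print(extract_words(SAMPLE))
```

The real files contain many more rule types, so this is only the simplest case.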

In addition to this, each language has multiple nuances that had to be taken care of, such as accents. For example, the French '0' is written as zéro (with an accent aigu over the e). However, we don't expect users to enter these accents each time, so we need to normalise (i.e. remove) them.
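One common way to do this kind of normalisation in Python is with the standard `unicodedata` module. The sketch below is an illustration of the idea and not necessarily how the library does it; `strip_accents` is a hypothetical name.

```python
import unicodedata


def strip_accents(text):
    """Remove accents by decomposing characters and dropping combining marks."""
    # NFD splits 'é' into the base letter 'e' plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("zéro"))  # zero
```

Applying the same function to both the extracted words and the user input means that "zero" and "zéro" end up matching the same entry.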

The most challenging aspect was definitely understanding the CLDR RBNF structure, which I am still not completely clear about. There is only a little documentation explaining some of the basic rules, and it's tough to identify which rules are relevant and which aren't.

Originally I was hoping to add more tests this week as well, but all this took longer than expected, so the testing work is going to be pushed to the current week.
