Hello! My name is Ishaan Jain, an Information Technology undergrad at Manipal University Jaipur. I will be working on developing an information-theoretic approach to separate artificial information from real information in geospatial datasets for Xbitinfo during Google Summer of Code 2023.
What did I do this week?
This week, my main focus was on testing the function responsible for removing artificial information from various datasets. The function takes two inputs, the Cumulative Distribution Function (CDF) of the information and the bit information itself, and uses them to identify and discard artificial information, returning the true keepbits.
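To make the two inputs concrete, here is a minimal sketch of how a CDF can be built from per-bit information and turned into a keepbits value. It is not the function being tested: the names `information_cdf` and `keepbits_from_cdf`, the 99% information level, and the assumption that `bitinfo` holds one value per mantissa bit (most significant first) are all illustrative choices. It uses only the standard library so that it runs on its own.

```python
from itertools import accumulate


def information_cdf(bitinfo):
    """Cumulative fraction of the total information, bit by bit.

    `bitinfo` is assumed to hold one information value per mantissa bit,
    ordered from the most significant bit to the least significant one.
    """
    total = sum(bitinfo)
    if total == 0:
        # No information at all: the CDF is flat at zero.
        return [0.0] * len(bitinfo)
    return [running / total for running in accumulate(bitinfo)]


def keepbits_from_cdf(cdf, inflevel=0.99):
    """Smallest number of leading bits whose CDF reaches `inflevel`."""
    for index, value in enumerate(cdf):
        if value >= inflevel:
            return index + 1
    # The level is never reached, so every bit is kept.
    return len(cdf)


# Hypothetical bit information for six mantissa bits.
cdf = information_cdf([4, 3, 2, 1, 0, 0])
print(keepbits_from_cdf(cdf))  # 4
```

With these made-up values the first four bits carry all of the information, so four bits are kept.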
I then applied the function to each dataset and analyzed how the CDF and bit information behaved in each case. In some datasets, the CDF became constant after a certain bit. There, I used the function to cut off the trailing bits beyond the point where the CDF stabilized. In other datasets, the bit information dropped to zero. For these, I used the true keepbits to truncate the trailing bits, which removed the artificial information from those datasets.
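The idea of cutting at the point where the bit information reaches zero (which is also where the CDF stops rising) can be sketched as follows. This is only an illustration under stated assumptions, not the actual function: `first_plateau`, `truncate_mantissa`, and the tolerance `tol` are hypothetical names and values, `bitinfo` is assumed to cover the mantissa bits only, and the values are treated as float32 with a 23-bit mantissa.

```python
import struct


def first_plateau(bitinfo, tol=1e-6):
    """Index of the first mantissa bit whose information is below `tol`.

    That index is the number of leading bits before the plateau, so it
    can be used directly as the keepbits value.
    """
    for index, info in enumerate(bitinfo):
        if info < tol:
            return index
    # No plateau found: keep every bit.
    return len(bitinfo)


def truncate_mantissa(value, keepbits):
    """Zero every float32 mantissa bit after the first `keepbits` bits."""
    drop = max(23 - keepbits, 0)  # float32 has 23 explicit mantissa bits
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    bits &= (0xFFFFFFFF << drop) & 0xFFFFFFFF
    (truncated,) = struct.unpack("<f", struct.pack("<I", bits))
    return truncated


# Hypothetical bit information: real information in the first three bits,
# a plateau of zeros, then a small amount of artificial information.
bitinfo = [0.9, 0.5, 0.1, 0.0, 0.0, 0.02, 0.03]
keepbits = first_plateau(bitinfo)
print(keepbits, truncate_mantissa(1.1259765625, keepbits))  # 3 1.125
```

In this example the small values after the plateau are treated as artificial, so only the first three mantissa bits survive the truncation.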
The insights gained from this testing phase will be crucial as we move forward to integrate the function into the project's data processing pipeline.
What is coming up next?
Next, I will try to tackle variables where artificial information doesn't necessarily show up only in the trailing mantissa bits.