Articles on anandbaburajan's Blog
https://blogs.python-gsoc.org
Updates on different articles published on anandbaburajan's Blog
en
Mon, 31 Aug 2020 06:48:25 +0000
GSoC: Week #14
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-14/
<p>Hello!</p> <p><b>What did you do this week?</b></p> <p>I wrote docs, looked into K2IS’ sync_flag and completed implementing positive sync_offset for HDF5. I also extracted the recursive slice splitting code in HDF5 into its own function and wrote tests for it. While working on HDF5, I ran into a strange bug in which the unit tests passed when run on real datasets but failed when run on randomly generated temporary data.</p> <p><b>What is coming up next?</b></p> <p>I’ll handle combinations of sync_offset and reshaping, and fix sig_shape reshaping and the bug mentioned above for HDF5. I’ve almost figured out how the sectors are synced in K2IS, so I hope to finish that soon. I have a few more things on my todo list before my PR can be merged, and I think this will be my last GSoC blog post, so thanks a lot for reading!</p> <p><b>Did you get stuck anywhere?</b></p> <p>No, but I must say that the best and most challenging part of this journey was deciding between different algorithmic/design solutions to solve problems in my project.</p> <p>Thanks again to my mentors, PSF and Google for this opportunity!</p> <p>:-)</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 31 Aug 2020 06:48:25 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-14/
GSoC: Week #13
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-13/
<p>Hello!</p> <p><b>What did you do this week?</b></p> <p>I worked on a recursive function which splits tile_slices in order to properly read from reshaped HDF5 files. Now the scan dimensions can be reshaped for n-D HDF5 data!! But there’s a bug to be fixed before the detector dimensions can be reshaped. Splitting tiles was much faster than reading each frame using its index. 
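</p>
<p>The splitting idea can be sketched as follows. This is only a minimal illustration, not LiberTEM’s actual code: the function name split_flat_range and its arguments are hypothetical, and it assumes the goal is to turn a contiguous range of flattened frame indices (C order) into rectangular slices over the dataset’s original n-D scan shape, which is the kind of region a single sliced read can address.</p>

```python
from math import prod
from typing import List, Tuple

SliceTuple = Tuple[slice, ...]


def split_flat_range(start: int, stop: int, shape: Tuple[int, ...]) -> List[SliceTuple]:
    """Split the flat index range [start, stop) over `shape` (C order)
    into rectangular n-D slices, returned in reading order.

    Hypothetical helper for illustration only.
    """
    if start >= stop:
        return []
    if len(shape) == 1:
        return [(slice(start, stop),)]

    inner_shape = shape[1:]
    inner_size = prod(inner_shape)  # elements per step of the outermost axis
    first, first_rem = divmod(start, inner_size)
    last, last_rem = divmod(stop, inner_size)

    if first == last:
        # The whole range sits inside one entry of the outermost axis.
        return [
            (slice(first, first + 1),) + rest
            for rest in split_flat_range(first_rem, last_rem, inner_shape)
        ]

    parts: List[SliceTuple] = []
    if first_rem:
        # Partial leading entry: recurse into the inner dimensions.
        parts += [
            (slice(first, first + 1),) + rest
            for rest in split_flat_range(first_rem, inner_size, inner_shape)
        ]
        first += 1
    if first < last:
        # Fully covered entries can be read as one block.
        full = tuple(slice(0, n) for n in inner_shape)
        parts.append((slice(first, last),) + full)
    if last_rem:
        # Partial trailing entry: recurse into the inner dimensions.
        parts += [
            (slice(last, last + 1),) + rest
            for rest in split_flat_range(0, last_rem, inner_shape)
        ]
    return parts
```

<p>For a 3x4 scan, the flat range 2 to 9 becomes three reads: the tail of row 0, all of row 1 and the head of row 2.</p>
<p>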
I also improved the coordinates generation by making it a cached property, and I fixed some bugs in the RAW and EMPAD datasets that caused errors when specifying a shape larger than their image_count.</p> <p><b>What is coming up next?</b></p> <p>I’m working on HDF5’s sig_shape reshaping and sync_offset. K2IS’ sync_flag is still WIP. A few changes need to be made on the client side to handle the deprecation of ‘scan_size’ and ‘detector_size’. I’ll update the docs as well.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 24 Aug 2020 06:54:42 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-13/
GSoC: Week #12
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-12/
<p>Hello!</p> <p><b>What did you do this week?</b></p> <p>I improved the coordinates generation code according to the feedback I got from my mentors. Now it performs better, as only the UDFs which need the coordinates call the generation code. There are quite a lot of ways to read from an HDF5 file, so to find the optimal solution to HDF5’s sync_offset problem, I created another prototype to read binary data from an HDF5 file using indices, but it turned out to be 10x slower than reading slices directly using read_direct. I also got a better understanding of the K2-IS dataset’s meta and implementation in LiberTEM.</p> <p><b>What is coming up next?</b></p> <p>I’ll continue working on HDF5 and K2IS's sync flag. There are a few more things I can do to improve the coordinates generation code, so I’ll work on that. 
There are also a few bugs to be fixed and usability improvements to be made on the client side before the PR can be merged.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 17 Aug 2020 04:14:48 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-12/
GSoC: Week #11
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-11/
<p>Hello!</p> <p><b>What did you do this week?</b></p> <p>I added the feature for getting the coordinates of frames in a slice for UDFs, wrote tests, fixed a few bugs and continued working on HDF5.</p> <p><b>What is coming up next?</b></p> <p>I’m finally going to work on the integration of sync_offset into K2IS's sync flag. I’ll continue working on HDF5’s sync_offset and make sure it works with reshaping and ROI. I’ll also make sure that FRMS6's dark frame correction works properly with a sync_offset.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Sun, 09 Aug 2020 17:54:46 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-11/
GSoC: Week #10
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-10/
<p>Hi there!</p> <p><b>What did you do this week?</b></p> <p>I completed ‘sync_offset’ for SER, fixed a few bugs, wrote tests for ‘reshaping’ for all the formats and continued working on HDF5, FRMS6 and K2IS.</p> <p><b>What is coming up next?</b></p> <p>My next aim is to figure out how to integrate the reshape feature into the UDF interface and write tests for it. This would help UDFs get their coordinates in the actual navigation space. I’ll also continue working on offset for K2IS and HDF5. 
The FRMS6 format is almost done; I just need to figure out how to handle the dark frame correction when the signal dimensions are changed.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 03 Aug 2020 08:34:27 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-10/
GSoC: Week #9
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-9-1/
<p>Hello!</p> <p><b>What did you do this week?</b></p> <p>I added ‘nav_shape’ and ‘sig_shape’ parameters for all the formats, implemented reshaping for formats using the general ‘get_tiles’, and reworked the API and the GUI forms.</p> <p><b>What is coming up next?</b></p> <p>My mentor suggested a much better way to implement ‘sync_offset’ for the HDF5 format, so I’ll work on that this week. I’ll also try to finish the offset feature for K2IS, FRMS6 and the SER format, which uses its own I/O functions. There’s a small bug with the DM format which needs to be fixed. I’ll also write suitable tests for the reshaping feature for all the formats.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 27 Jul 2020 08:12:41 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-9-1/
GSoC: Week #8
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-8/
<p>Hello!</p> <p><b>What did you do this week?</b></p> <p>I finally completed the sync offset feature, with and without a ROI, for the HDF5 format! I spent some time understanding the FRMS6 format, found a bug with the ROI selector in the GUI and opened an issue for it. 
I also introduced a new member to our community and he’s interested in improving the UI, which I’m really happy about!</p> <p><b>What is coming up next?</b></p> <p>I’ll finish the reshaping feature for most of the formats which use the common I/O implementation and I'll continue working on the offset feature for FRMS6.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No, but I think I took quite some time to finish HDF5’s offset feature.</p>
anandbaburajan@gmail.com (anandbaburajan)
Sun, 19 Jul 2020 18:25:10 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-8/
GSoC: Week #7
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-7-1/
<p>Hi!</p> <p><b>What did you do this week?</b></p> <p>I reworked the positive offset feature for HDF5 to improve it based on what I have learned from the other formats. HDF5’s I/O implementation is different from the other formats as it uses the h5py library and the dataset’s navigation dimensions can’t be flattened before reading. This makes it a bit complicated to reshape the slices with an offset, so my implementation often requires reading twice from the dataset. I also fixed an integer overflow bug, and refactored and benchmarked ‘get_tiles_straight’ and ‘get_tiles_w_copy’.</p> <p><b>What is coming up next?</b></p> <p>This week’s focus will be to finish my prototype for the general reshaping feature, as so far I only have a simple prototype that handles HDF5 files with flattened scan dimensions. I’ll also complete HDF5’s negative offset feature with and without a ROI this week. I’ll work on the FRMS6 format too this week if possible.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No. The ‘empty’ but ‘colored’ frames issue in the GUI which I mentioned earlier is occurring with HDF5 too, along with some other formats. 
So I think the visualisation code needs some changes to handle missing data (zeros or NaNs).</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 13 Jul 2020 03:51:01 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-7-1/
GSoC: Week #6
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-6/
<p>Hi!</p> <p><b>What did you do this week?</b></p> <p>I implemented the offset feature in ‘get_tiles_straight’ both with and without a region of interest for raw and ‘memory’ datasets, fixed the bug in ‘get_tiles_w_copy’, which marks the completion of the offset feature for all the remaining formats except HDF5, and wrote tests for benchmarking. I spent the rest of the time looking into possible ways of implementing the reshaping feature.</p> <p><b>What is coming up next?</b></p> <p>I’ll benchmark the offset feature for the different formats, try to improve it and finish it for HDF5. I’ll continue working towards the general reshaping feature and work on its integration with the UDF interface.</p> <p><b>Did you get stuck anywhere?</b></p> <p>I haven’t figured out how to deal with the K2IS format properly. I’m not sure if I should implement my feature into the existing sync flag, but I’m working on it. In raw files, setting a negative offset results in empty but colored frames in the GUI but not when used with the Python API in a Jupyter notebook.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 06 Jul 2020 11:08:18 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-6/
GSoC: Week #5
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-5/
<p>Hi!</p> <p><b>What did you do this week?</b></p> <p>I worked on the API to properly separate additional info for the offset and reshaping features from the dataset parameters. I’m almost finished with the offset feature in the general tile generation function used for most of the formats. 
Both positive and negative offsets work, both with and without an ROI applied, but I still have to fix a bug which inserts real frames instead of empty ones when a negative offset is set.</p> <p><b>What is coming up next?</b></p> <p>I’ll fix the bug to complete the feature, make sure the K2IS format works as expected, implement the offset feature for the ‘memory’ dataset for testing and work on the general reshaping feature.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 29 Jun 2020 06:15:55 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-5/
GSoC: Week #4
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-4-1/
<p>Hi!</p> <p><b>What did you do this week?</b></p> <p>I found out that for the manual sync feature to work correctly, the partitions need to span the missing frames introduced by setting a negative offset too, along with spanning any missing data at the end. That means only the tiling part should be changed, not the partitions. So I wrote a simple prototype for setting a positive sync offset for the HDF5 format’s get_tiles function. If the approach is good, a similar one can be used for setting an offset for the rest of the formats, the difference being the way data is read. I also found out that this way, the K2IS format’s sync flag can be handled too. I also implemented reshaping in the GUI form this week.</p> <p><b>What is coming up next?</b></p> <p>I’ll work on handling the offset with a ROI, make changes to the API to handle additional info for the features and improve the prototypes. 
I’ll start working on generalizing them for n-D datasets after I get my mentors’ approval.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 22 Jun 2020 10:26:01 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-4-1/
GSoC: Week #3
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-3/
<p>Hi!</p> <p><b>What did you do this week?</b></p> <p>Last week’s manual sync feature wasn’t correct as the partitions are supposed to span the missing data and ignore the tiles, not just skip the missing frames. So I fixed that issue this week, wrote some tests accordingly, added some UX improvements and finished implementing the manual sync feature for the MIB dataset. I wrote a prototype for reshaping a 3D dataset with flattened scan dimensions into 4D. It doesn’t actually reshape the data; instead, it reshapes the 4D slice objects into 3D and then reads the data using the slices.</p> <p><b>What is coming up next?</b></p> <p>I’ll implement the manual sync feature with tests for K2IS, HDF5 and the memory format this week. The K2IS format will need a closer look as it is quite different from the other formats and allows specifying a sync flag. The prototype for reshaping might also need to be improved or completely changed, so I’ll work on that too.</p> <p><b>Did you get stuck anywhere?</b></p> <p>No.</p>
anandbaburajan@gmail.com (anandbaburajan)
Sun, 14 Jun 2020 19:11:21 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-3/
GSoC: Week #2
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-2-1/
<p>Hello again!</p> <p><b>What did you do this week?</b></p> <p>I added tests and made some improvements according to the feedback I got on the manual sync feature for SEQ files. I also added some tests for the JS-based GUI, and that’s how I was introduced to Jest. 
The rest of my time was spent looking into ways to implement the dataset reshaping feature.</p> <p><b>What is coming up next?</b></p> <p>LiberTEM supports a variety of data formats and each of them has its own quirks. So my next challenge is to find the best approach to reshaping which is flexible enough to support all the formats.</p> <p><b>Did you get stuck anywhere?</b></p> <p>Yes, while working on the manual sync feature’s offset validation. I couldn’t figure out how to dynamically change the JSON schema on the server side. Also, the form implementation includes a nested Formik form field component and the error handling felt a bit off as, for some reason, validation error messages weren’t showing at all. So either I’m doing it wrong or I’m not supposed to be doing it. But I learned quite a lot about Formik and I’ll take another look at it after my prototype for reshaping is ready.</p>
anandbaburajan@gmail.com (anandbaburajan)
Sun, 07 Jun 2020 19:58:12 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-2-1/
GSoC: Week #1
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-1/
<p>Hello!</p> <p>I’m Anand, a third-year CSE undergraduate at GEC Palakkad and I’m going to be contributing to LiberTEM throughout the summer (and beyond). My project basically involves enhancing LiberTEM’s data pipeline to handle certain use-cases in processing 4D STEM datasets. I began contributing in February and it has been a great experience already as I was exposed to different parts of LiberTEM under the guidance of my mentors.</p> <p><b>What did you do this week?</b></p> <p>I worked on implementing a feature for manual synchronisation of Norpix SEQ files which would allow users to specify an offset to skip frames or insert blank ones in case of acquisition/synchronisation problems. 
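</p>
<p>The offset semantics can be illustrated with a small sketch. It is not the LiberTEM implementation: the function name apply_sync_offset, the list-based frames and the blank placeholder are all hypothetical. It assumes that a positive offset skips frames at the start, a negative offset inserts blank frames at the start, and the total frame count stays the same, with blanks filling whatever is missing at the end.</p>

```python
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def apply_sync_offset(
    frames: Sequence[Frame],
    sync_offset: int,
    blank: Optional[Frame] = None,
) -> List[Optional[Frame]]:
    """Shift `frames` by `sync_offset` while keeping the frame count fixed.

    Hypothetical helper for illustration only.
    """
    count = len(frames)
    if sync_offset >= 0:
        # Positive offset: drop the first frames, pad the end with blanks.
        kept = list(frames[sync_offset:])
        return kept + [blank] * (count - len(kept))
    # Negative offset: insert blanks first, drop frames that no longer fit.
    pad = min(-sync_offset, count)
    return [blank] * pad + list(frames[: count - pad])
```

<p>With four frames f0 to f3, an offset of 1 yields f1, f2, f3 and one blank, while an offset of -1 yields one blank followed by f0, f1 and f2.</p>
<p>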
Apart from my really helpful mentors, I was fortunate to receive feedback from a LiberTEM user, who helped me find bugs and gave suggestions on the feature.</p> <p><b>What is coming up next?</b></p> <p>I’ll be working on handling missing/unfinished data to complete the manual sync feature. Once the feature is bug-free and approved for the SEQ format, I’ll start implementing it for the other data formats. On the side, I’m going to discuss and try writing prototypes to allow reshaping of nD-datasets-which-are-actually-4D.</p> <p><b>Did you get stuck anywhere?</b></p> <p>Yes, a few times. I made a silly mistake in my implementation and in addition to that, I initially misunderstood what ‘skipping’ frames actually meant. Then I came up with a working solution for the feature but as it was specific to just the SEQ format, I went ahead with a better solution my mentor suggested as it allowed the functionality to be shared between other formats too. While working towards the better solution, I got stuck on an error for about a day because I had not studied the data tiling concept in detail. I finally decided to turn to my mentor, who quickly explained the concept with a code example. That’s also when I realized that using LiberTEM’s Python API in a notebook is a better way to try out prototypes than monkey-patching within the GUI.</p> <p>I’m really excited to learn more and keep contributing to LiberTEM! Thanks to my mentors, the LiberTEM community, Python Software Foundation and Google for this opportunity!</p>
anandbaburajan@gmail.com (anandbaburajan)
Mon, 01 Jun 2020 14:03:41 +0000
https://blogs.python-gsoc.org/en/anandbaburajans-blog/gsoc-week-1/