Articles on DebadityaPal's Blog
https://blogs.python-gsoc.org
Updates on different articles published on DebadityaPal's Blog
en
Thu, 26 Aug 2021 10:45:08 +0000

Final Blog Post: The Finished Product ⭐
https://blogs.python-gsoc.org/en/debadityapals-blog/final-blog-post-the-finished-product/
Whenever someone tries to learn a new package, the first stop is usually the documentation. But documentation is mostly long and boring to read through. Furthermore, when it is auto-generated, it often contains a plethora of arguments, most of which are internal and of no use to the end user. Data science is an interactive field by nature, so packages for data science should follow suit. <br><br> Hub is a data optimization package that enables users to stream unlimited amounts of data from the cloud to any machine without sacrificing performance compared to local storage. <br><br> It is a standalone package that comes with its own set of APIs. Therefore, if a user wants to incorporate Hub into their project, they must first get comfortable with these APIs and understand how to use them. <br><br> The primary reason users rely on libraries and packages in their projects is to simplify the coding process and make it much faster. A package's main purpose is to spare the user from implementing everything manually in their own codebase. Hence, learning how to use Hub should be faster than finishing the project without Hub. <br><br> The idea of “Learn” is to provide a much more interesting and faster way of learning how to use Hub. It does so by serving interactive code-along tutorials, much like DataCamp, that users can take from the comfort of their local terminal. “Learn” comes with a single command that starts the whole course engine, and the rest works based on user feedback. <br><br> The way it works is that a course library contains all of the course content in YAML files. 
The content is divided into small bits of information we call “Snippets”. The course engine contains a YAML parser that reads the information from these files and presents it to the user one snippet at a time. Currently, we have 3 types of snippets to add variety to the way information is presented: <ol> <li>Text Snippet: Purely meant for reading, does not expect user feedback.</li> <li>MCQ Snippet: Poses a multiple choice question for the user and expects them to enter an answer.</li> <li>Code Snippet: Provides a prompt and expects the user to code along.</li> </ol> <br><br> The API has been designed in such a way that more types of Snippets can easily be added. <br><br> Writing new courses is also really simple, should a user want to add their own. A full guide can be found at https://learn-hub.readthedocs.io/en/latest/course.html. In short, it just involves writing one Snippet at a time following a particular format. With this package, learning how to use Hub is now a much faster process, and we hope to make it as easy as possible for newcomers to start using Hub. Moreover, the package is completely extensible and community-driven: if you feel like writing a course on a topic, feel free to do so!
debaditya.pal6@gmail.com (DebadityaPal)
Thu, 26 Aug 2021 10:45:08 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/final-blog-post-the-finished-product/

Blog Post #5: Colors 🔴 🟠 🟡 🟢
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-5-colors/
This week was all about colors. The standard monochrome of the terminal seems somewhat boring, so to make the course more palatable I wanted to add different colors. An added benefit of colors is that they add structure and hierarchy to the snippets. Different areas are highlighted differently, making the whole thing easier to read and digest. 
<br> <img src="https://drive.google.com/uc?export=view&amp;id=1nOzZISIpKrrZ4zdWRCwLveyAx4GgzohV" style="display: block; width: 100%;"> <br> So the question is, how do we color text in the console? The way I did it was to use `colorama`, a Python package that does just this. The syntax and API for colorama are fairly simple. One just has to add the color they want to the print statement, so that is what I did. However, this wasn't working in Command Prompt. For some reason, CMD was printing a weird code instead of the color. <br><br> So, I turned to my mentor to gain some insight into what was going on. He recommended I use the init function in colorama. After doing so, it started working! The issue was with escape sequences. For some reason, CMD can't handle them well enough and ends up printing them. Calling the init function takes care of this.
debaditya.pal6@gmail.com (DebadityaPal)
Sat, 14 Aug 2021 07:04:40 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-5-colors/

Weekly Check-In #5: Courses and Feedback 👨🏼‍💻
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-5-courses-and-feedback/
<h1>What did I do this week?</h1> This week was spent away from the IDE. The majority of it was spent in Google Docs: I wrote the 2 courses that I wanted to and shared the links with my mentor and the devs at Activeloop. Everyone was super helpful and gave a lot of feedback, which shaped the way the courses turned out. It was an iterative process but a necessary one. The main goal of the week was to write 2 courses, one on the basic topics of Hub and the other on the more advanced topic of parallel computation. <h1>What is coming next?</h1> This week essentially completes my project, so next week I will be working on a stretch goal. The idea is to introduce colors to the courses so that the interface is easier on the eyes for the user. Moreover, there is some documentation that needs to be written, and I will try to optimize the whole project as well. 
<h1>Did I get stuck anywhere?</h1> This week was not technically challenging at all. It just involved writing courses and incorporating the feedback. So I did not get stuck anywhere.
debaditya.pal6@gmail.com (DebadityaPal)
Sat, 07 Aug 2021 08:10:24 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-5-courses-and-feedback/

Blog Post #4: Course Writing Begins 📝
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-4-course-writing-begins/
The last week was a bit intense as it involved learning a lot of topics really fast. Since Code Snippets are the star of my project, I wanted to settle for nothing less than perfection. But now that it has been successfully implemented, I can breathe a sigh of relief. The main technical part of my project is over. I have a fully functional course engine armed with 3 types of snippets ready to be deployed. <br><br> Now begins the second phase of my project, viz. course writing. I have divided the entirety of Hub's features into two courses. The first one will cover the basics of Hub and enable everyone to use all of its main features, like accessing and uploading datasets, along with links to the visualization platform. The second one will cover the more advanced topic of parallel computing and how to use it to speed up overall performance. <br><br> I have decided to write the basic script for my courses in Google Docs and share it with multiple people on the core team of Hub. 
Feedback is going to be crucial for this step as it will allow me to shape the course content much better.
debaditya.pal6@gmail.com (DebadityaPal)
Tue, 03 Aug 2021 08:11:50 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-4-course-writing-begins/

Weekly Check-In #4: ASTs and Code 💻
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-4-asts-and-code/
<h1>What did I do this week?</h1> This week was spent on the implementation of Code Type Snippets, which is probably the most important part of my entire project. There were a lot of roadblocks, but it has finally been implemented. The solution I used was to hack Python's <code>code.InteractiveConsole</code> and tweak it to suit my needs. <br><br> The output of the interactive console was then parsed into an AST (Abstract Syntax Tree). I decided to go with ASTs instead of string matching because the same code can be written in different ways; comparing ASTs allows the user to do that and is more robust. <br><br> <code>code.InteractiveConsole</code> emulates a REPL, so I had to hack it to stop it from running endlessly, so that I could evaluate the code from the user input after every step. <h1>What is coming next?</h1> Next, I will be working on creating the different courses for Hub. The main technical part of the project is over. I will also write the necessary documentation and add the required code optimizations. But the most important job at hand right now is to talk to my mentor, and the community in general, to figure out what kind of course content they would prefer. <h1>Did I get stuck anywhere?</h1> There were many areas where I got stuck. The biggest one was that I was not able to import modules into my custom InteractiveConsole. I tried to tweak things around to get it to work but absolutely couldn't. So I immediately approached my mentor, David. He responded with a code snippet of the actual <code>code.InteractiveConsole</code> where he got it to work. 
From there, I dissected my code, figured out which part was throwing errors, and changed it accordingly. To my great surprise, the error was coming from a deepcopy function, so I chucked it out and did the deep copy manually.
debaditya.pal6@gmail.com (DebadityaPal)
Thu, 22 Jul 2021 08:29:16 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-4-asts-and-code/

Blog Post #3: Code Type Snippets 👨‍💻
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-3-code-type-snippets/
So we have just passed the halfway mark on our GSoC journey and a considerable amount of my project is over. However, now is not the time to rest. Now comes the most important part of my project, i.e., Code Type Snippets. It is this feature that is going to encourage and entertain users the most so that they keep on using my package. <br> <img src="https://drive.google.com/uc?export=view&amp;id=10O75Z81od7g6hR9xdgiYWDZXZ4Iq1cxj" style="display: block; margin: 20px auto 20px auto;"> <br> The idea is to let users write Python code at runtime while a course is being served, and to check that code for correctness. So, in a way, my program will have another program running inside of it. Sort of like a shell inside a shell. Shellception! But how would we do this? <br><br> Turns out Python already has a built-in module for it called <code>code</code>. This module has a few classes, like the InteractiveConsole class, which closely emulates the actual interactive Python shell experience and takes in user code. This code is stored as a string; once compiled, it can be executed with the <code>InteractiveInterpreter.runcode()</code> function, which generates the variables dynamically at runtime. 
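<br><br> As a rough illustration of that idea (a minimal sketch, not the actual implementation used in the project), the snippet below compiles a string of user code and executes it with <code>InteractiveInterpreter.runcode()</code>, so that variables persist in a shared namespace between calls. The names <code>namespace</code> and <code>run_user_code</code> are hypothetical and exist only for this example.

```python
import code

# Shared namespace: variables created by one piece of user code stay
# available to the next one (hypothetical name, for illustration only).
namespace = {}
interpreter = code.InteractiveInterpreter(locals=namespace)


def run_user_code(source):
    """Compile one statement of user input and run it in the shared namespace."""
    # compile() turns the string into a code object.
    compiled = compile(source, "<user-input>", "single")
    # runcode() executes the code object; if the code fails at runtime,
    # it prints a traceback instead of raising.
    interpreter.runcode(compiled)


run_user_code("x = 6 * 7")
run_user_code("y = x + 1")  # relies on x defined by the previous call
print(namespace["x"], namespace["y"])
```

Keeping a single namespace dictionary is what would let a later Code Snippet reuse variables defined by an earlier one.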
<br><br> So for now, I plan to use these for the Code Type snippets: I can use simple string matching to check the code for correctness, and later run the code to keep the variables ready for the next Code Snippet within the same chapter.
debaditya.pal6@gmail.com (DebadityaPal)
Thu, 15 Jul 2021 07:20:40 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-3-code-type-snippets/

Weekly Check-In #3: MCQ Snippet ✔️
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-3-mcq-snippet/
<h1>What did I do this week?</h1> This week was spent implementing a specific type of Snippet, viz. the MCQ Snippet. This is a very important subtype as it is currently the only way to check retention of information on the user's part. The YAML structure for this snippet had to be made simple so that anyone can use it irrespective of their technical know-how. The current YAML structure looks like this: <br> <img src="https://drive.google.com/uc?export=view&amp;id=1mAFrpL8pOmzY6_eQJcvIWl2IIkhENnH3" style="display: block; margin: 20px auto 20px auto;"> <br> The keys are self-explanatory. To properly check for retention, I decided to randomly shuffle the options every time the snippet is parsed. The hint is shown after every wrong answer given by the user. <h1>What is coming next?</h1> Since 2 out of the 3 proposed Snippets have been added, I will be working on the 3rd one next, i.e., the Code Snippet. This one is by far going to be the most complicated snippet as it involves evaluating Python code written at runtime. I'll take a look at swirl to see how they have achieved this with R, and maybe look into the "code" module in Python. <h1>Did I get stuck anywhere?</h1> The MCQ snippet was pretty easy to implement, since it's a non-console snippet and the abstract classes defined during the previous weeks already had the implementations of the basic functions. 
So I did not get stuck anywhere during this week.
debaditya.pal6@gmail.com (DebadityaPal)
Wed, 07 Jul 2021 07:50:18 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-3-mcq-snippet/

Blog Post #2: Documentation 📕
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-2-documentation/
This week was mainly spent on writing docstrings to improve the overall documentation of the project. My goal was to get the API reference up to date with the work I've done and get it hosted via ReadTheDocs. However, there were a couple of hiccups on the way, hence I am devoting this blog post to ReadTheDocs. <br><br> ReadTheDocs makes parsing docstrings and hosting documentation a lot easier than having to do it manually; however, its documentation is a bit unfriendly towards first-time users. For example, to use an automatic docstring parser, one has to install <b>recommonmark</b>, but a simple pip install won't be enough. After some docs hunting, I found out that one needs to add this function to the configuration file. <br> <code> <pre>
from recommonmark.transform import AutoStructify

...

# Base URL used to resolve relative links in the docs
github_doc_root = "https://github.com/rtfd/recommonmark/tree/master/doc/"


def setup(app):
    # Register the recommonmark options with Sphinx
    app.add_config_value(
        "recommonmark_config",
        {
            "url_resolver": lambda url: github_doc_root + url,
            "auto_toc_tree_section": "Contents",
        },
        True,
    )
    # Enable the AutoStructify transform
    app.add_transform(AutoStructify)
</pre> </code> <br> I hope this code snippet helps others in automating their documentation pipeline. I am also sure that the authors of these packages will fix these issues in the near future. Long live Open Source Software.
debaditya.pal6@gmail.com (DebadityaPal)
Wed, 30 Jun 2021 06:51:30 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-2-documentation/

Weekly Check-In #2: MetaClasses and OOP 🤖
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-2-metaclasses-and-oop/
<h1> What did I do this week? 
</h1> This week was spent creating the various classes and subclasses for the Course Engine. I finished writing all the different functions and utility functions that perform the following tasks: <ul> <li>Recursively load data from YAML files</li> <li>Verify the chapters for missing fields before serving them</li> <li>Serve the chapters on the CLI</li> </ul> I used Python's metaclass concept to make "Snippet" a metaclass for the different types of snippets that will be implemented in the following weeks. The purpose of this is to ensure that certain functions are implemented in the inheriting child classes. <h1> What is coming next? </h1> Over the course of the next week, I will be trying to improve the documentation of my codebase and introduce unit tests to ensure a robust system. Once that is done, I will be implementing specific types of Snippets using the base class as the parent. The first in the list is the TextSnippet, followed by MCQSnippet and CodeSnippet. <h1> Did I get stuck anywhere? </h1> This week was spent on the basics of Object Oriented Programming (OOP), so I didn't get stuck anywhere. However, the concept of a metaclass was new to me, so I had to read some documentation and tinker around with it before implementing it in my project.
debaditya.pal6@gmail.com (DebadityaPal)
Wed, 23 Jun 2021 07:05:48 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-2-metaclasses-and-oop/

Blog Post #1: The Skeletal Framework ⚙️
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-1-the-skeletal-framework/
During this week, I analyzed the flow of data in my project and modelled the classes accordingly. The course data is expected to be stored in a YAML file, hence the first thing that I needed to implement was a YAML parser. Thankfully, Python has a package aptly named <code>yaml</code> for that specific purpose. 
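<br><br> A minimal sketch of that parsing step, assuming the third-party PyYAML package (imported as <code>yaml</code>) is installed. The keys used here (<code>name</code>, <code>chapters</code>, <code>title</code>, <code>snippets</code>, <code>type</code>, <code>content</code>) are hypothetical placeholders and not the project's real course format.

```python
import yaml  # provided by the third-party PyYAML package

# Hypothetical course file content; the real format may differ.
COURSE_YAML = """
name: Hub Basics
chapters:
  - title: Getting Started
    snippets:
      - type: text
        content: Welcome to the course.
      - type: mcq
        content: Which format stores the course content?
"""

# safe_load parses the document into plain dicts and lists
# without constructing arbitrary Python objects.
course = yaml.safe_load(COURSE_YAML)

for chapter in course["chapters"]:
    print(chapter["title"], "has", len(chapter["snippets"]), "snippets")
```

The parsed result is just nested dictionaries and lists, which is why it still needs to be given some structure afterwards.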
The course file would have the following structure: <br> <img src="https://drive.google.com/uc?export=view&amp;id=1qfv6ji_V44u_wJSgsuCviPCG5X2wwjcA" style="display: block; margin: 20px auto 20px auto;"> <br> Now, once we have the raw data from the YAML file, we need to give it some structure for ease of computation. I decided to have the following hierarchical structure. <ul> <li>Course</li> <ul> <li>Chapter 1</li> <ul> <li>Snippet 1</li> <li>Snippet 2</li> <li>Snippet 3</li> </ul> <li>Chapter 2</li> <ul> <li>Snippet 1</li> <li>Snippet 2</li> </ul> </ul> </ul> <br> Once this was decided, I went on to create the basic classes for a Course, Chapter and Snippet. My plan is to keep these classes as base classes so that in future, if more features are to be added, the newer classes can simply inherit the base functionality from these classes. However, these classes will also need a few restrictions, such as mandatory data fields and mandatory functions that every inheriting subclass must implement. For the latter, I aim to explore metaclasses in Python. A metaclass can be used to control how a certain class behaves.
debaditya.pal6@gmail.com (DebadityaPal)
Wed, 16 Jun 2021 07:19:46 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/blog-post-1-the-skeletal-framework/

Weekly Check-In #1: Setting Up 🌎
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-1-setting-up/
<h2>What did I do this week?</h2> I used the community bonding time to set up the documentation pipeline for my project. I decided on using <a href="https://readthedocs.org/">ReadTheDocs</a>, because who doesn't love free hosting and automated documentation? Since it integrates directly with GitHub repositories and rebuilds itself automatically after every push, the entire process of keeping the documentation up to date is simplified. 
<h2>What is coming next?</h2> Now the main crux of my project begins, i.e., creating a Course Engine that can parse data files and serve courses over the terminal. I think I'll have to study how the Python REPL works, as that will be instrumental to this project. A package called <a>Swirl</a> has already done this in R. Maybe I'll take a look at it as well. <h2>Did I get stuck anywhere?</h2> ReadTheDocs was supposed to be simple, right? Well, it wasn't. Even after following their documentation to a T, the docs build was failing due to a "module not found" error for a Markdown module called "myst-parser". So I had to dive deeper and try to understand how RTD handles module imports. I set up a configuration file and explicitly mentioned "myst-parser" as a dependency. It worked like a charm after that. Maybe I'll even open an issue in their repo about this.
debaditya.pal6@gmail.com (DebadityaPal)
Tue, 08 Jun 2021 08:01:29 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/weekly-check-in-1-setting-up/

Coding Phase Starts! 🚀
https://blogs.python-gsoc.org/en/debadityapals-blog/coding-phase-starts/
Hello there! I am Debaditya and I am working on a project that is very close to my heart. Do you remember the not-so-good old days when you just started exploring Open Source Development? Treading through huge codebases one file at a time, trying to understand what is going on, fixing one bug only to get bogged down by 5 more. Yeah, that was me 2 months ago 😅. <br> <br> But since then, all the folks at <a href="https://www.activeloop.ai/">Activeloop</a> have helped me out quite a bit and today here we are! So I wanted to create something that would help the OSS Padawans who come after me. If asked "Do you like to read documentation?", I'm sure not more than 2 or 3 people in a room would say "Yes!". What if you could have short interactive code-along tutorials to teach the basics of a package? Suddenly the hands start going up. 
To throw in a cherry on top, let's say that these tutorials would be served in the comfort of your system's local terminal! <br> <br> So that is my project. I will be creating an interactive onboarding environment for all those who want to learn about Activeloop's package: <a href="https://github.com/activeloopai/Hub">Hub</a>. <br> <br> This concludes my first blog post, but don't worry, I'll be back in a few days!
debaditya.pal6@gmail.com (DebadityaPal)
Tue, 08 Jun 2021 07:14:35 +0000
https://blogs.python-gsoc.org/en/debadityapals-blog/coding-phase-starts/