This week I tried my hand at setting up Hadoop in Docker.
The next phase of our project involves making it compatible with input from a Hadoop data source. With my limited knowledge of Hadoop and Docker, getting the setup running took some effort. I first set it up on my local computer and got it working. I then wrote the basic classes needed to establish a connection, and successfully connected.
I also added config() and args() methods that can be used to fetch the arguments, and their corresponding values, specific to the Hadoop source. The challenging part with Hadoop is handling the files from HDFS. These files can be either CSV or JSON, so I need to discuss with my mentor how to handle them.
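
One option I might bring to that discussion is to pick the parser from the file extension. Below is a minimal sketch in Python, used here only for illustration. It assumes the file contents have already been read from HDFS into a string, and `parse_records` is a hypothetical helper, not something that exists in the project.

```python
import csv
import io
import json
from pathlib import PurePosixPath


def parse_records(path: str, text: str) -> list:
    """Turn file contents into a list of dict records, based on the extension.

    Hypothetical helper: `path` is the HDFS path (used only for its suffix)
    and `text` is the file content, assumed to be already fetched.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".csv":
        # The first row is treated as the header; all values stay strings.
        return list(csv.DictReader(io.StringIO(text)))
    if suffix == ".json":
        data = json.loads(text)
        # Accept either a single object or an array of objects.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported file type: {suffix!r}")


if __name__ == "__main__":
    print(parse_records("/data/users.csv", "id,name\n1,alice\n"))
    print(parse_records("/data/user.json", '{"id": 1, "name": "alice"}'))
```

Both formats end up as the same list-of-dicts shape, so the rest of the pipeline would not need to know which one it received. Whether the extension is a reliable enough signal is an open question.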