sudharsana-kjl's Blog

Final Weekly Check-in

sudharsana-kjl
Published: 08/26/2019

In the final week of coding, I was refining the Hadoop source PR.

What did I do this week?

The Dockerfile is finally working, so we can now set up Hadoop with it. The connection to Hadoop is also established in the application.

What is coming up next?

There are a few bug fixes left to do. The Hadoop and MySQL features are also going to be packaged and uploaded to PyPI, similar to the models. I will be working on that as well.

Did you get stuck anywhere?

Fixing the Hadoop source connection and writing data to an HDFS stream was an issue. My mentor and I had another meeting this week and we fixed it.

View Blog Post

Blog #6

sudharsana-kjl
Published: 08/22/2019

In the past week, my mentor and I tried to fix the Dockerfile that sets up Hadoop in an Ubuntu container from scratch. Since that was becoming tedious, we tried setting up a mini Hadoop cluster.

Apache has a mini Hadoop cluster setup that gives a single-node cluster. I tried building this using a Maven Docker image. The documentation has very little information on where Hadoop actually gets downloaded and which ports it connects to by default. My mentor and I debugged the Dockerfile and tried to get this up and running, but there is still a problem with the ports and I'm working on it. We also figured out how to get the files from HDFS, which can be either CSV or JSON files. I have implemented those changes as well.
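As a rough illustration of handling both file types, the sketch below picks a parser from the file extension once the file's bytes have been read from HDFS. It uses only the Python standard library. The function name `parse_records` and the sample paths are hypothetical, and the actual HDFS read is left out, so this is not the code from the PR.

```python
import csv
import io
import json
from pathlib import PurePosixPath


def parse_records(path: str, raw: bytes) -> list:
    """Parse bytes already read from HDFS into a list of dict records.

    Hypothetical helper: the parser is chosen from the file extension.
    """
    text = raw.decode("utf-8")
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".csv":
        # The first row is treated as the header
        return list(csv.DictReader(io.StringIO(text)))
    if suffix == ".json":
        data = json.loads(text)
        # Accept either a list of records or a single record
        return data if isinstance(data, list) else [data]
    raise ValueError(f"Unsupported file type: {suffix!r}")


if __name__ == "__main__":
    print(parse_records("/data/sample.csv", b"a,b\n1,2\n"))
    print(parse_records("/data/sample.json", b'[{"a": 1, "b": 2}]'))
```

Note that CSV values come back as strings while JSON values keep their types, so a real source would probably need a conversion step on top of this.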

Hopefully by next week I can finish this project.

View Blog Post

Weekly Check-in #10

sudharsana-kjl
Published: 08/22/2019

In the past week, I was trying to set up Hadoop using a Dockerfile.

What did I do this week?

Setting up Hadoop in Docker with my limited knowledge of both is becoming more difficult, because there are few resources on how to set this up specifically with a Dockerfile. Also, every time I have to build the container from scratch, downloading all the files again and setting everything up is time-consuming. I tried an approach this week that got most of the instructions in the Dockerfile working, yet there is still an issue with starting the containers. I have added the corresponding config files that Docker will use, along with a start-up shell script that is run while building the container to start Hadoop after installing it.

What is coming up next?

I need to get the Dockerfile working by this week so that I can move ahead, refine the Hadoop source classes, and add more tests if possible.

Did you get stuck anywhere?

Debugging the Dockerfile was a difficult task for me. My mentor was very understanding and helped me fix it.

View Blog Post

Blog #5

sudharsana-kjl
Published: 08/22/2019

This week I was trying my hand at setting up Hadoop in Docker.

The next phase of our project involves making it compatible with input from a Hadoop data source. With my limited knowledge of Hadoop and Docker, I was trying to set it up. First I set it up on my local computer and got it working. I have written the basic classes needed to establish a connection, and I successfully set one up.

I also added config() and args() methods that can be used to fetch the arguments specific to the Hadoop source and their corresponding values. In Hadoop, the challenging part is handling the files from HDFS, which can be either CSV or JSON files. So I have to discuss with my mentor how to handle this.
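To give an idea of what an args()/config() pair can look like, here is a minimal sketch built on the standard library's argparse. It is not the project's actual API: the `HadoopSourceConfig` class, the flag names, and the default values are all hypothetical placeholders chosen for illustration.

```python
import argparse
from dataclasses import dataclass


@dataclass
class HadoopSourceConfig:
    # Hypothetical settings; the real source may need different ones
    host: str
    port: int
    filepath: str


def args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Declare the arguments specific to the Hadoop source."""
    parser.add_argument("--hadoop-host", default="localhost")
    # Placeholder default port, adjust to match the cluster
    parser.add_argument("--hadoop-port", type=int, default=8020)
    parser.add_argument("--hadoop-filepath", required=True)
    return parser


def config(namespace: argparse.Namespace) -> HadoopSourceConfig:
    """Build the config object from the parsed argument values."""
    return HadoopSourceConfig(
        host=namespace.hadoop_host,
        port=namespace.hadoop_port,
        filepath=namespace.hadoop_filepath,
    )


if __name__ == "__main__":
    parser = args(argparse.ArgumentParser())
    namespace = parser.parse_args(["--hadoop-filepath", "/data/input.csv"])
    print(config(namespace))
```

Keeping the argument declarations separate from the config construction means the source class only ever sees a typed config object, not raw command-line strings.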

View Blog Post

Weekly Check-in #9

sudharsana-kjl
Published: 08/06/2019

In the past week, I was working on making certain changes to my previous PRs and getting them ready for the upcoming release.

What did I do this week?

For the MySQL source, I was trying to set up the Travis build. This was my first time working with Travis CI for this project, and it was a good learning experience. The label for the CSV source merge test was done by my mentor. Some tests are failing now, and I was trying to fix those as well. I also tried setting up Hadoop in Docker. Initially my approach was to build a Hadoop container and push it to Docker Hub so that anyone could fetch it. After discussing with my mentor, I'm going to set up a Dockerfile instead, so that people can see the commands being run and it will be easier to set up as well.

What is coming up next?

I'll try to set up the Hadoop connection within a week. I have to speed things up a bit so that I can focus on the remaining work once the connection is set up.

Did you get stuck anywhere?

I did get stuck on setting up Hadoop. After discussing with my mentor, I now have a clear idea of how to approach it.

View Blog Post