tomasb's Blog

Weekly blog #4 (week 8): 15/07 to 21/07

tomasb
Published: 07/21/2019

Hey. During week 8 I was working mostly on my own, since one of my mentors was on holiday. I did a few things, but on some days my productivity, or “throughput”, just wasn’t good. Let me tell you about it.

 

Starting with text, I trained more text classifiers for manual tests, and opened a PR for the text code in ELI5 itself. In the codebase I did some refactoring and fixed broken image tests. I also added character-level explanations and an option to remove pre-padding from text input.

 

Did I mention “character-level”? Indeed, I managed to resolve last blog’s issue of low accuracy for a character-based network. The solution? Increasing the batch size from 30 to 500. http://theorangeduck.com/page/neural-network-not-working and http://karpathy.github.io/2019/04/25/recipe/ gave some ideas. (This fix was quite random and I got lucky, so I still haven’t solved the problem of knowing how to train neural nets!)

 

I also trained a network with multiple LSTM layers. Training on CPU, at first things didn’t go past epoch 1. My sequences were way too long, so the solution was to reduce their length. In my case I found that LSTMs realistically only work for sequences of about 100 tokens.
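
For illustration, the setup looks something like this minimal Keras sketch (vocab_size, sequences, and the layer sizes are placeholders, not my actual values):

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
from keras.preprocessing.sequence import pad_sequences

vocab_size = 10000                        # placeholder vocabulary size
X = pad_sequences(sequences, maxlen=100)  # keep only the last 100 tokens of each sequence

model = Sequential([
    Embedding(vocab_size, 32, input_length=100),
    LSTM(32, return_sequences=True),  # pass the full sequence on to the next LSTM
    LSTM(32),                         # second LSTM layer returns only the final state
    Dense(1, activation='sigmoid'),
])
```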

 

An interesting issue when refactoring the text codebase was how to handle duplicate function arguments. Specifically, we have a dispatcher function and multiple concrete functions. These concrete functions have unique arguments, but also share some. See https://github.com/TeamHG-Memex/eli5/blob/453b0da382db2507972cf31bb25e68dae5674b57/eli5/keras/explain_prediction.py for an example. I ended up not using any **kwargs magic and typed out all the arguments in the dispatcher and concrete functions myself (readable, but something to watch out for when making changes).
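
To illustrate the trade-off, here is a rough sketch of the pattern (the function and argument names are made up, not the actual ELI5 signatures):

```python
def explain_image(model, doc, targets=None, layer=None, image=None):
    ...  # concrete image explanation

def explain_text(model, doc, targets=None, layer=None,
                 tokens=None, pad_value=None):
    ...  # concrete text explanation

def explain(model, doc, targets=None, layer=None,
            image=None, tokens=None, pad_value=None):
    # Dispatcher: the shared and unique arguments are all typed out
    # explicitly instead of being forwarded through an opaque **kwargs.
    if image is not None:
        return explain_image(model, doc, targets=targets,
                             layer=layer, image=image)
    return explain_text(model, doc, targets=targets, layer=layer,
                        tokens=tokens, pad_value=pad_value)
```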

 

I followed the PyTorch 60-minute blitz tutorial (mostly just copy-pasting code and looking up the docs for some objects). I must admit that I didn’t understand autograd, and I will need to go through it more carefully before doing gradient calculations.
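
For reference, the core of autograd comes down to something like this tiny example:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)  # track all operations on x
y = (x * 3).sum()                         # a scalar computed from x
y.backward()                              # backpropagate: fills in dy/dx
print(x.grad)                             # tensor([[3., 3.], [3., 3.]])
```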

 

I went on to load a pretrained model from torchvision together with ImageNet data (since the dataset is 150 GB, I loaded just a few local samples). I will use this setup when writing the Grad-CAM code.
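
In case it’s useful, the loading looks roughly like this (the file name is a placeholder; the normalisation constants are the standard ImageNet ones):

```python
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(pretrained=True)  # weights are downloaded on first use
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
img = preprocess(Image.open('sample.jpg')).unsqueeze(0)  # add a batch dimension
with torch.no_grad():
    scores = model(img)
print(scores.argmax(dim=1))  # predicted ImageNet class index
```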

 

Workflow-wise, I discovered that setting the PYTHONPATH variable, for example running `PYTHONPATH=$PWD/SOMEPATH jupyter notebook`, is a convenient way to make sure Python can find your local modules. This is much easier than relocating your scripts or modifying sys.path.

 

Work on the first image PR has stopped. Improvements to it are now being made through the second, text-focused PR.

 

So what could I have done better, productivity-wise? I wish I had progressed the text PR more. It is still at a very early WIP stage, with no tests or docs, and sprinkled with TODOs and FIXMEs. Regarding PyTorch, I wish I had written some experimental Grad-CAM code. Overall, it is hard not to get lazy when working without “checks”, like leaving a comment about what you are working on or syncing up in more detail during the week.

 

Next week we have the second evaluation. I heard that I will have to submit PRs, so it would be good to do two things: open a ‘WIP’ PyTorch PR once I have some Grad-CAM code, and remove the ‘WIP’ label from the text PR.


 

That’s it for the week. Hope the next one is productive!

Tomas Baltrunas


Weekly check-in #6 (week 7): 08/07 to 14/07

tomasb
Published: 07/14/2019

Hi! At the start of week 7 I learned that one of my mentors will be going on holiday for the next two weeks. This means that during that time I will mostly work by myself. Excited to see what’s going to happen!

What did you do this week?

Since this was my mentor’s last week before the holidays, we had sync-up calls on both Wednesday and Friday. We talked about how to explain RNNs and character-based models. An interesting addition to the text codebase this week was resizing, or “resampling”, 1D arrays. I ended up using scipy.signal.
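
The idea, roughly (the array values here are made up), is to stretch a short importance array over a longer token sequence:

```python
import numpy as np
from scipy.signal import resample

heatmap = np.array([0.1, 0.9, 0.3, 0.5])  # e.g. one value per conv output step
stretched = resample(heatmap, 10)         # Fourier-based resampling to 10 points
```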

 

For the image PR, another mentor joined in and left some comments for improvement, mostly about clarifying the docs and fixing outdated information (could this be automated or tested?).

 

On the workflow side, I did a couple of things on Sunday, such as creating a symlink from my long GSOC path to simply ~/gsoc in my home directory, and removing a commit from the middle of a series of commits with git rebase.

What is coming up next?

I couldn't get down to doing some PyTorch this week, so I should definitely start with it next week.

 

It would be good to clean up the text branch (fix regressions; add tests, docs, and a tutorial) and make a second PR. The code should explain at least some models reasonably well, so I’ll need to train more RNNs (with more than one LSTM layer) and a character-based model (see below for issues) for testing.

 

The image PR has more or less settled down. I will implement any remaining suggestions, and once the PR gets merged I might add commits for a new release.

Did you get stuck anywhere?

The first stumbling block this week was the theory behind RNNs. I read some articles about them, but I still don’t have an exact idea of how they work. And I haven’t even started on LSTMs!

 

But the main obstacle this week was training a character-based network. Firstly, the text datasets in Keras (for example IMDB) are made of word tokens, not characters. I had to get the original IMDB texts and build a character-level tokenizer. This took some time, but it worked.
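
The tokenizer part is pleasantly short, since Keras’s Tokenizer has a char_level flag (the sample texts below are placeholders):

```python
from keras.preprocessing.text import Tokenizer

texts = ['This movie was great!', 'Awful film...']  # the raw IMDB reviews go here
tokenizer = Tokenizer(char_level=True)  # split into characters instead of words
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)  # lists of per-character ids
```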

 

Training the network on CPU was so slow that I couldn’t get past one epoch! My mentor suggested picking a reasonable max length for the sequences, say the 99th or 95th percentile of the lengths, and training on a GPU. Since I’m a broke student who can’t afford a GPU with CUDA compute capability 3.5, I used Kaggle kernels.
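
The percentile trick is a one-liner with numpy (sequences being the tokenized texts from above):

```python
import numpy as np
from keras.preprocessing.sequence import pad_sequences

lengths = [len(seq) for seq in sequences]
maxlen = int(np.percentile(lengths, 95))     # long enough for 95% of the data
X = pad_sequences(sequences, maxlen=maxlen)  # longer sequences get truncated
```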

 

The training worked, but the network’s accuracy stayed close to 50%, i.e. no better than chance on a binary task. This issue is yet to be resolved using some neural network troubleshooting techniques.


 

That’s the update for the week. Again, I’m excited to work more independently over the next two weeks. Hopefully it won’t end with no progress and all tests failing, and we’ll get some work done!

 

Tomas Baltrunas


Weekly blog #3 (week 6): 01/07 to 07/07

tomasb
Published: 07/07/2019

Hello there. It’s time for a weekly blog. One thing to note right away is that half of GSoC is already gone (week 6 is over)! I really do hope to put in some solid work before everything finishes.

 

Let me start with what I did this week. Firstly, the PR once again got closer to being finished. To mention a few of the interesting changes: with the help of my mentor I added a check, though not a perfect one, for whether a network outputs ‘scores’ (unconstrained values) or ‘probabilities’ (values between 0 and 1). I also had to make one argument optional on a common object in the library, without breaking too many things and without too many corrections (mypy helped massively). An interesting addition to the Grad-CAM code was checking whether differentiation failed (i.e. whether any gradients came back as None) and writing an automated test for that.
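
The gradient check amounts to something like this (a sketch with made-up variable names, not the exact PR code):

```python
from keras import backend as K

# model.output[:, target]: the score being explained;
# layer.output: the activations we differentiate with respect to.
grads = K.gradients(model.output[:, target], [layer.output])
if any(g is None for g in grads):
    raise ValueError('Gradient calculation failed (got None gradients)')
```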

 

Outside the PR, the mentors made a new ELI5 release, and I merged those changes into my fork.

 

Regarding text, I managed to add some code to ELI5 to get a working end-to-end text explanation example (trust me, you don’t want to see the code under the hood - a bunch of hardcoding and if statements!). Improvements were made.

 

Through the week I came across a few issues. 

 

For example, at the start of Tuesday, just as I was about to do some productive work, I realised that either I or Jupyter Notebook had not saved some of the work I had done on Monday. Big shout-out to the IPython %history magic, which let me recover the commands I had run. I’ll hit CTRL+S all the time from now on!
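
For anyone in the same spot, the recovery goes something like this (IPython keeps past sessions in an on-disk history database):

```python
# In a fresh IPython session: -g searches all stored sessions,
# -f writes the matching input lines to a file.
%history -g -f recovered.py
```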

 

I felt that a big issue this week was a lack of productivity, especially around the ELI5 text code. Sometimes I found myself making random changes to the code without much planning. I was overwhelmed by how much there is to refactor and implement, and I still have to deal with the problem of reusing existing code from the image explanations. A couple of things have helped:

  1. Again: make small changes and test. Don’t make changes “into the future” that you can’t test right now.

  2. Create a simple list of TODO items in a text editor, like I did for image code before. This is great for seeing the big picture, taking tasks one-by-one instead of everything at once, and marking tasks as done.

Writing library code - code that interacts with existing code and code that may expose an API to users - is much harder than simple experimental snippets in a Jupyter Notebook. Organisation helps.

 

On Sunday I got around to adding virtualenvwrapper to my workflow. Goodbye “source big-relative-or-absolute-path-to-my-virtualenv/bin/activate” and hello “workon eli5gradcam”! I also tried to train a ConvNet on a multiclass classification problem (Reuters news topics), but the model kept getting an appalling 30% accuracy. Perhaps I need a minor modification to the hyperparameters, or maybe ConvNets are not so good for multiclass text tasks?
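
The kind of model I was experimenting with looks something like this (the layer sizes and lengths are illustrative, not the exact ones I used):

```python
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense

model = Sequential([
    Embedding(10000, 64, input_length=200),  # top 10k words, sequences cut to 200
    Conv1D(64, 5, activation='relu'),        # 1D convolution over token windows
    GlobalMaxPooling1D(),
    Dense(46, activation='softmax'),         # the Reuters dataset has 46 topics
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```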

 

I believe that next week will be much the same as week 6, topic-wise: small PR changes, library code for text, and experimentation to make the code cover more possible models (RNNs will come, someday!).

 

Thank you for reading once again, and see you next time!

Tomas Baltrunas


Weekly check-in #5 (week 5): 24/06 to 30/06

tomasb
Published: 07/01/2019

Hi all. I am running out of ideas for opening sentences, so let me just begin!

What did you do this week?

This week was the evaluation, though it was very short and work continued as usual. On Monday and Wednesday I synced up with one of my mentors, discussing how we will incorporate text into the existing ELI5 code and what our next steps are.

 

However, I wanted to get the PR out of the way, so most of what I did this week was resolving comments and opening new GitHub issues for the larger TODOs and FIXMEs. Finally, all the checks went green!

 

Only on Sunday did I get to do a bit of work on text, reusing existing ELI5 code to highlight text importance.

What is coming up next?

Now that I have a roughly working “end-to-end” example of text explanations in a Jupyter Notebook, I need to incorporate it into the ELI5 codebase (on a new branch). The example is indeed only “roughly working”, so I will need to experiment and debug, and add explanations for more models (RNNs come to mind).

 

Thinking about the next stage - what to do by the second evaluation - I have two top-priority tasks in mind. First, explain text. Second, add PyTorch support alongside Keras. These are roughly the same goals as in the proposed schedule, with some differences.

Did you get stuck anywhere?

I found myself working on the PR for too long; it seems harder for me to switch between tasks than to work on a single one. A good resolution for next week would be to focus on text and spend less time on the PR.

 

Remembering everything I did during the week and putting it into a blog post in one day also tends to be hard. My intended solution was to write every couple of days. That did not happen, but I did manage to write the blog a day early (Sunday instead of Monday)!

 

When working on the “end-to-end” text explanation example, I found the ELI5 objects used for HTML formatting a bit mystical. Fortunately there was a tutorial showing highlighted text, and stepping through the ELI5 code with pdb and examining the values helped.
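
If you have never tried it, dropping into the debugger is a one-liner:

```python
import pdb; pdb.set_trace()  # pause here: 'n' steps over, 's' steps into, 'p expr' prints a value
```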


 

That’s the check-in for the week. On with the work!

Tomas Baltrunas


Weekly blog #2 (week 4): 17/06 to 23/06

tomasb
Published: 06/25/2019

Hi! Welcome to my second “blog post” in which I will talk about my work, problems, and solutions for week 4. Let’s jump into it...

 

As expected, this week was split between working on my existing Pull Request (Grad-CAM applied to images) and exploring how to apply Grad-CAM to text.

 

Starting with the Pull Request work, I continued resolving various comments from my mentors.

 

At one point I wanted to write a simple test case checking that a warning is raised when certain dependencies are missing during a particular function call. My problem was that I spent way too much time on it. Trying to mock dependencies or do tricks with sys.modules is hard. Since the case was quite specialised and rare, the solution was to simply hack up a manual test. Lesson: use your time economically!
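
For the record, the sys.modules trick I was wrestling with looks roughly like this (a pytest sketch; the warning type and the function under test are placeholders):

```python
import sys
import pytest

def test_warns_when_matplotlib_missing(monkeypatch):
    # Putting None into sys.modules makes 'import matplotlib' raise ImportError,
    # simulating an environment where the dependency is absent.
    monkeypatch.setitem(sys.modules, 'matplotlib', None)
    with pytest.warns(UserWarning):
        function_that_falls_back_without_matplotlib()
```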

 

Another problem I often ran into involved the Jupyter Notebook tutorial for the docs. Whenever I converted the notebook to .rst, I had to make some repetitive manual changes (changing the path URIs of image links). Thankfully, in one of the GitHub comments a mentor pointed out that there is a script in the repo dedicated to exactly these manual notebook conversions. Shout-out to “sed” and “mv”!

 

Lastly on the PR, when I was making changes to the Grad-CAM gradient and mean calculations, I had to deal with the concept of “dataflow programming” in TensorFlow (the backend of Keras). It was a bit tricky to get used to, as I sometimes forgot that TensorFlow operations don’t actually “happen” until you call a function to “evaluate” the instructions. After some attempts I got the code to work.
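
To make “evaluate the instructions” concrete, here is a self-contained sketch of the pattern (a toy model, not the actual PR code):

```python
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Conv2D, GlobalAveragePooling2D, Dense

model = Sequential([
    Conv2D(4, 3, input_shape=(8, 8, 1)),
    GlobalAveragePooling2D(),
    Dense(1, activation='sigmoid'),
])
conv_output = model.layers[0].output
loss = model.output[:, 0]

grads = K.gradients(loss, [conv_output])[0]     # only builds graph nodes
weights = K.mean(grads, axis=(0, 1, 2))         # still symbolic, nothing has run
compute = K.function([model.input], [weights])  # compile a callable
x = np.zeros((1, 8, 8, 1))
weights_value, = compute([x])                   # here the ops actually execute
```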

 

Onto the exciting topic of the week: text! I managed to train a bunch of simple text models in a Jupyter Notebook, and then applied the same Grad-CAM steps to the text models that I had used for images. In many cases the results were indeed disappointing, but in one case the explanation does work!

 

The first issue with text was padding. In some cases it’s useful to have every text sequence be the same length, padding a text with some “padding token” if it is too short. The problem is what to do with this padding once we get a Grad-CAM heatmap (showing the importance of each token for some prediction). My solution was to simply cut the heatmap at the point where the padding starts. This produced an actual working result, though with edge cases. Another issue yet to be resolved is applying Grad-CAM to text without padding: my models, trained with padding but given input without padding, were way off. Padding was a big discussion point with my mentor this week.
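
The cutting itself is simple. Assuming Keras-style pre-padding with token id 0 (the example arrays below are made up), it is roughly:

```python
import numpy as np

tokens = np.array([0, 0, 0, 11, 42, 7])       # pre-padded input sequence
heatmap = np.array([.0, .1, .0, .6, .9, .2])  # one importance value per token

first_real = np.argmax(tokens != 0)  # index of the first non-padding token
heatmap = heatmap[first_real:]       # drop the part covering the padding
tokens = tokens[first_real:]
```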

 

Another issue is that there are many different model architectures for text. As advised, to avoid making things harder for myself, I decided to leave RNNs, LSTMs, etc. for later, sticking to fully connected and convolutional networks for now. Of course, there are unsolved issues with these architectures as well: a traditional model with Dense layers and a Global Average Pooling layer seemed to give no results at all.

 

So you can see that there were many technical issues around text, and things went slowly. Over the weekend I looked at a deep learning book for some background on text topics such as tokenization, vectorization, and embeddings. To train more models faster I also wanted to install TensorFlow for GPU, but my GPU (compute capability 3.0) is too old for the current TensorFlow requirements (at least compute capability 3.5 for the easiest installation). Google Colab and Kaggle kernels, here I come!

 

I hope this was not too long and that you were able to skim through it. Next week, hopefully, we’ll see an “end-to-end” example of a text prediction being explained. I might even finish the PR (I know, I’ve been saying this for the last two weeks...). Also, enjoy writing for Evaluation 1. Let us pass!

 

Tomas Baltrunas
