lakshmi97's Blog

Week 12 & Week 13 - August 21, 2023

lakshmi97
Published: 08/23/2023

Finalized experiments using both datasets: Week 12 & Week 13

============================================================

 

What I did this week

~~~~~~~~~~~~~~~~~~~~

MONAI's VQVAE results on the T1-weighted NFBS dataset (125 samples, batch_size=5) were qualitatively and quantitatively superior to all previous results. I continued the same experiments on the T1-weighted CC359 (Calgary-Campinas-359) public dataset, which consists of 359 anatomical MRI volumes of healthy individuals. The data was preprocessed using the existing `transform_img` function (a rough code sketch follows the list below), which:

1. skull-strips the volume using the respective mask

2. uses DIPY's `resize` & scipy's `affine_transform` to scale the volume to a (128,128,128,1) shape

3. applies MinMax normalization to limit the intensity range to (0,1)
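A rough sketch of what these three steps might look like; the actual `transform_img` uses DIPY's `resize` alongside scipy, so the scaling step here is only an approximation and the function/argument names are illustrative assumptions.

.. code-block:: python

   import numpy as np
   from scipy.ndimage import affine_transform

   def transform_img_sketch(volume, mask, target_shape=(128, 128, 128)):
       # 1. skull-strip: zero out everything outside the brain mask
       stripped = volume * (mask > 0)

       # 2. scale to the target shape; the affine matrix maps output voxel
       #    coordinates back to input coordinates
       scale = np.diag(np.array(stripped.shape) / np.array(target_shape))
       resized = affine_transform(stripped, scale, output_shape=target_shape)

       # 3. MinMax-normalize intensities into (0, 1) and add a channel axis
       resized = (resized - resized.min()) / (resized.max() - resized.min() + 1e-8)
       return resized[..., np.newaxis]                  # -> (128, 128, 128, 1)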

Using the existing training parameters, I carried out two experiments: one on CC359 alone & another on both datasets combined. Additionally, I made a slight modification to the loss definition, weighting background & foreground pixels as 0.5 & 1 respectively, compared to the equal weights used in previous experiments. This resulted in faster convergence, as shown by the red, blue & purple lines in the combined plot below -

Combined training plots for all experiments
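A minimal sketch of the weighted reconstruction loss described above; the foreground mask derived by thresholding the (MinMax-normalized) target and the use of squared error are assumptions for illustration - only the 0.5 & 1 weights come from the experiment.

.. code-block:: python

   import tensorflow as tf

   BG_WEIGHT, FG_WEIGHT = 0.5, 1.0

   def weighted_recon_loss(y_true, y_pred):
       # assumed foreground mask: non-zero voxels of the skull-stripped target
       foreground = tf.cast(y_true > 0.0, tf.float32)
       weights = BG_WEIGHT + (FG_WEIGHT - BG_WEIGHT) * foreground
       squared_error = tf.square(y_true - y_pred)
       return tf.reduce_sum(weights * squared_error) / tf.reduce_sum(weights)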

 

Inference results on the best performing model - B12-both - are as follows -

VQVAE-Monai-B12-both reconstructions & originals showing equally spaced 5 slices for 2 samples

 

This shows that our training not only converged quickly but also improved visually. Here's a comparison of our current best performing model, VQVAE-Monai-B12-both, with the previous one, VQVAE-Monai-B5-NFBS. Their test reconstruction losses are 0.0013 & 0.0015 respectively.

VQVAE reconstruction comparison for B12-both & B5-NFBS

 

I also carried out Diffusion Model training on the best performing B12-both model for 300 & 500 diffusion steps, and the training curves obtained are as follows -

Diffusion Model training plots for 300 & 500 diffusion steps

 

These curves seemed to converge pretty quickly but the sampling outputs in the generation pipeline are still pure noise.

 

What is coming up next week

~~~~~~~~~~~~~~~~~~~~~~~~~~~

Wrapping up documentation & final report

 

Did I get stuck anywhere

~~~~~~~~~~~~~~~~~~~~~~~~

Yes, I spent time debugging the generation pipeline of the Diffusion Model. I cross-checked the implementations of the posterior mean & variance in the code base against the corresponding formulas from the paper, as well as against MONAI's DDPM implementation. I didn't come across any error, yet the generated samples are still erroneous.
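For reference, the standard DDPM posterior mean & variance (Ho et al., 2020) that the code was checked against are:

.. math::

   \tilde{\mu}_t(x_t, x_0) = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1 - \bar{\alpha}_t}\, x_0
   + \frac{\sqrt{\alpha_t}\,(1 - \bar{\alpha}_{t-1})}{1 - \bar{\alpha}_t}\, x_t,
   \qquad
   \tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\,\beta_t

where :math:`\alpha_t = 1 - \beta_t` and :math:`\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s`.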






 


Week 10 & Week 11 - August 7, 2023

lakshmi97
Published: 08/23/2023

Carbonate issues, GPU availability, Tensorflow errors: Week 10 & Week 11

===========================================================================

 

What I did this week

~~~~~~~~~~~~~~~~~~~~

Recently, I was assigned an RP (Research Project) account on Indiana University Bloomington's HPC cluster, Carbonate. This account gives me dedicated access to multiple GPUs for my experiments.

Once I started configuring my sbatch file accordingly, I ran into issues with GPU access. My debug print statements revealed that I was only accessing a CPU despite configuring the sbatch job for more than one GPU. I double-checked my dataloader definition, DistributionStrategy, and train function, and read through IU's blogs as well as other resources online to see if I was missing something.
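A minimal sketch of such a debug check (not the exact print statements used), assuming TensorFlow is imported inside the SLURM job; the environment variable names are standard SLURM/CUDA ones:

.. code-block:: python

   import os
   import tensorflow as tf

   # Standard SLURM/CUDA environment variables plus TensorFlow's device list;
   # a mismatch between requested and visible GPUs shows up here.
   print("CUDA_VISIBLE_DEVICES:", os.environ.get("CUDA_VISIBLE_DEVICES"))
   print("SLURM_GPUS_ON_NODE:", os.environ.get("SLURM_GPUS_ON_NODE"))
   print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))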

Nothing worked, and my mentor eventually asked me to raise an IT request on Carbonate; the IT personnel couldn't help either. This could only mean that Tensorflow was not picking up on the assigned GPUs. So, on my mentor's suggestion, I loaded an older version of the deeplearning module, 2.9.1 (I had used 2.11.1 earlier). This worked!

This also meant using a downgraded version of tensorflow (2.9), which in turn meant I ran into errors again - time-consuming yet resolvable ones. I made some architectural changes to accommodate the older tensorflow version: replaced GroupNorm with BatchNorm layers and switched the tensor_slices-based DataLoader to a DataGenerator. Additionally, I had to change the model structure from a list of layers to a ``tensorflow.keras.Sequential`` set of layers with input_shape information defined in the first layer. Without this last change, I ran into ``None`` object errors.
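A minimal sketch of that last change; the layer choices are illustrative assumptions, and only the move to ``tensorflow.keras.Sequential`` with an ``input_shape`` on the first layer reflects the actual fix:

.. code-block:: python

   import tensorflow as tf
   from tensorflow.keras import layers

   # Before (sketch): a plain Python list of layers applied manually, which
   # surfaced ``None`` object errors under the older TF version.
   # encoder_layers = [layers.Conv3D(32, 3, strides=2, padding="same"), ...]

   # After (sketch): a Sequential with input_shape on the first layer, so
   # every layer is built with a concrete shape up front.
   encoder = tf.keras.Sequential([
       layers.Conv3D(32, kernel_size=3, strides=2, padding="same",
                     input_shape=(128, 128, 128, 1)),
       layers.BatchNormalization(),
       layers.Conv3D(64, kernel_size=3, strides=2, padding="same"),
   ])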

Once all my new code was in place, the week had ended, hahahah. GPUs were also scarce that same week. I'm glad I got some work done though.

 

What is coming up next week

~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run more experiments!

 

Did I get stuck anywhere

~~~~~~~~~~~~~~~~~~~~~~~~

All I did was get stuck again & again :P

But all is well now.

 


Week 8 & Week 9 - July 24, 2023

lakshmi97
Published: 07/28/2023

VQVAE MONAI models & checkerboard artifacts: Week 8 & Week 9
============================================================


What I did this week
~~~~~~~~~~~~~~~~~~~~

We observed in our previous results that the Diffusion Model's performance may depend on better, more effective latents from the VQVAE. After playing around with convolutional & residual components in the existing architecture and getting unsatisfactory results, we decided to move to a model already proven on 3D MRIs. A model that worked well on the MNIST dataset won't necessarily deliver similar results on 3D MRI datasets, owing to the difference in complexity of the data distributions; changing the convolutions to 3D filters alone clearly did not do the job.


MONAI is an open source organization for Machine Learning in Medical Imaging; it has repositories and tutorials for various high-performing networks tested on multiple medical image datasets. We adopted the deep learning architecture for the VQVAE from MONAI's PyTorch implementation, which was trained & tested on BRATS (400 data elements). The predominant difference is that the encoder & decoder of this VQVAE use Residual units differently than our existing setup: these Residual units are alternated with the downsampling/upsampling convolutions in the encoder/decoder. Additionally, MONAI's VectorQuantizer uses non-trainable embeddings with statistical updates (Laplace smoothing) applied to them at every iteration.
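A rough sketch of what such a statistical (EMA) codebook update with Laplace smoothing typically looks like; the decay, epsilon and codebook sizes below are illustrative assumptions rather than MONAI's exact values:

.. code-block:: python

   import tensorflow as tf

   decay, eps = 0.99, 1e-5
   num_codes, code_dim = 256, 64

   # Non-trainable codebook plus the EMA statistics that update it.
   codebook = tf.Variable(tf.random.normal([num_codes, code_dim]), trainable=False)
   ema_cluster_size = tf.Variable(tf.zeros([num_codes]), trainable=False)
   ema_embed_sum = tf.Variable(tf.identity(codebook), trainable=False)

   def ema_update(flat_inputs, encodings_onehot):
       # flat_inputs: (N, code_dim); encodings_onehot: (N, num_codes)
       ema_cluster_size.assign(decay * ema_cluster_size
                               + (1 - decay) * tf.reduce_sum(encodings_onehot, 0))
       embed_sum = tf.matmul(encodings_onehot, flat_inputs, transpose_a=True)
       ema_embed_sum.assign(decay * ema_embed_sum + (1 - decay) * embed_sum)

       # Laplace smoothing keeps rarely used codes from collapsing to zero counts.
       n = tf.reduce_sum(ema_cluster_size)
       smoothed = (ema_cluster_size + eps) / (n + num_codes * eps) * n
       codebook.assign(ema_embed_sum / smoothed[:, None])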


I implemented MONAI's VQVAE architecture in Tensorflow from scratch, excluding the VectorQuantizer. This architecture has 46.5M trainable parameters. The training objective is to minimize the sum of the reconstruction & quantization losses - the same training paradigm as our previous experiments. In addition, to address the checkerboard artifacts, I referred to the Sub-Pixel Convolution paper <https://arxiv.org/abs/1707.02937>. This paper proposes two methods to overcome deconvolution overlap, the phenomenon that causes checkerboarded outputs in deconvolution/upsampling layers: Sub-Pixel Convolution & NN Resize Convolution. For an upsampling rate :math:`r`, Sub-Pixel Convolution outputs :math:`3r^2` channels & then shuffles the channel dimension along the spatial dimensions (upsampling each by :math:`r`), resulting in the 3 desired output channels. NN Resize Convolution instead upsamples the input by :math:`r` with nearest neighbor interpolation before carrying out a convolution that outputs 3 channels. The former relies on shuffling & the latter on nearest neighbor interpolation to obtain an upsampled output. Both methods have been shown to deal with the checkerboards better qualitatively under random initialization. The authors also prove mathematically that, with an appropriate initialization, the two methods are equivalent. They call this the ICNR initialization - Initialization of Convolution with NN Resize.
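A minimal 2D sketch of the two upsampling variants (the project itself works in 3D); the layer widths, kernel sizes and upsampling rate are illustrative assumptions:

.. code-block:: python

   import tensorflow as tf
   from tensorflow.keras import layers

   def nn_resize_conv(x, filters, r=2):
       # NN Resize Convolution: nearest-neighbour upsample by r, then convolve.
       x = layers.UpSampling2D(size=r, interpolation="nearest")(x)
       return layers.Conv2D(filters, kernel_size=3, padding="same")(x)

   def sub_pixel_conv(x, filters, r=2):
       # Sub-Pixel Convolution: predict filters * r^2 channels, then shuffle
       # them into the spatial dimensions with depth_to_space.
       x = layers.Conv2D(filters * r * r, kernel_size=3, padding="same")(x)
       return tf.nn.depth_to_space(x, block_size=r)

ICNR then initializes the sub-pixel kernel so that, at initialization, the two variants produce equivalent outputs.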


I ran multiple experiments with batch_size = 5, 10 & 10 (with ICNR). The training loss curves obtained are as follows; all models were trained on 1 GPU for 24 hrs. We see that all of them converge except the last one (B=10 with ICNR).

 

VQVAE3D Monai training curve

The best training checkpoint has been used to reconstruct test images. The following images depict 2 such reconstructions in 2 rows, where 5 slices from each reconstruction are displayed in the columns.


The first one is for B=10; the best training checkpoint had a training loss of 0.0037. Compared to our previous VQVAE model, we see better performance in capturing the brain's outer structure. Moreover, we don't see white blobs or artifacts as inner matter, but rather some curvatures contributing to the inner microstructure of a human brain.

VQVAE3D Monai, B=10


The second one is for B=10 with ICNR kernel initialization; the best training checkpoint had a training loss of 0.0067, although the test results do not look complete. I implemented ICNR through DIPY's resize function to achieve an NN-resize-equivalent output on the kernel filters. This initialization didn't work as intended, further suggesting that the training has yet to converge.

VQVAE3D Monai, B=10, ICNR initialization


The next & last image is for B=5; the best training checkpoint had a training loss of 0.0031 - by far the best one quantitatively as well as visually. The test loss for the reconstructions below is 0.0013. The superior performance of this batch size may be owed to the Batch Normalization (BN) layers in the architecture, which compute the mean & variance of the batch and normalize all batch elements using these statistics. A smaller batch size may lead to less variation in the layer's outputs & help the network converge faster. This explanation stems from the concept of Contrastive Learning, where BN layers act as a source of implicit negative samples: the higher the batch size, the more implicit negative samples there are to move away from. Since our objective here is instead to minimize the reconstruction loss, a smaller batch size may consequently help by reducing variation.

 

VQVAE3D, B=5


What is coming up next week
~~~~~~~~~~~~~~~~~~~~~~~~~~~

As the next step, I can focus on training the LDM (Latent Diffusion Model) using the best performing model from the above experiments.


Did I get stuck anywhere
~~~~~~~~~~~~~~~~~~~~~~~~

In both weeks, I had issues accessing resources & specifically multiple GPUs.


Week 6 & Week 7 - July 10, 2023

lakshmi97
Published: 07/28/2023

Diffusion Model results on pre-trained VQVAE latents of NFBS MRI Dataset: Week 6 & Week 7
==========================================================================================


What I did this week
~~~~~~~~~~~~~~~~~~~~


My current code for the VQVAE & DM is well tested on the MNIST dataset, as shown in the previous blog posts. I extended the codebase to the MRI dataset by using 3D convolutions instead of 2D ones, which resulted in 600k parameters for the VQVAE at a downsampling factor f=3. I used a preprocessing function to transform the MRI volumes to the desired shape (128,128,128,1) through DIPY's reslice and scipy's affine_transform functions, followed by MinMax normalization. I trained the VQVAE architecture with batch_size=10 and the Adam optimizer at lr=2e-4 for 100 epochs. I followed suit for downsampling factor f=2 as well and got the following training curves -

 

VQVAE3D Training Curves

The brain volumes reconstructed from the test dataset with the best performing model are shown below. As seen in the first image (f=3), there are black artifacts in the captured, blurry brain structure, whereas the second image (f=2) does a better job of producing a less blurry brain structure. Nonetheless, we only see the outline of the brain being captured, with no micro-structural information inside.

 

VQVAE3D, f=3, reconstructions
 

VQVAE3D, f=2, reconstructions

Later, the 3D Diffusion Model was trained for approximately 200 epochs, with 200 & 300 diffusion time steps in two different experiments. The training curves and the obtained generations are shown below. Both generations are noisy and don't really look convincing.

 

DM3D Training curves

 

DM3D reconstructions for 200 & 300 diffusion steps

Given the noisy generations achieved, I decided to train the VQVAE for a higher number of epochs. This may also indicate that the DM's performance hinges on good latent representations, i.e., a trained encoder capable of perfect reconstructions. So I trained the f=3 VQVAE for a higher number of epochs, as shown below.

 

VQVAE3D, f=3 further training

The reconstructions obtained with the best VQVAE seemed to produce a better volumetric brain structure. However, a common theme across all reconstructions is a pixelated output for the last few slices with checkerboard-like artifacts. I also ran a couple more experiments with a more complex VQVAE model that has residual blocks to carry information forward. Neither those reconstructions nor the DM generations made any qualitative progress.


What is coming up next week
~~~~~~~~~~~~~~~~~~~~~~~~~~~


One idea is to work on improving the VQVAE's effectiveness by playing around with architecture components and hyper-parameter tuning. Alongside that, I can also look into the checkerboard artifacts seen in the reconstructions.


Week 5 - June 26, 2023

lakshmi97
Published: 07/28/2023

Carbonate Account Setup, Experiment, Debug and Repeat: Week 5
=============================================================

What I did this week
~~~~~~~~~~~~~~~~~~~~

I finally got my hands on IU's HPC systems - Carbonate & Big Red 200. I quickly set up a virtual remote connection to Carbonate's Slate in VS Code with Jong's help. Later, I started looking into Interactive jobs on Carbonate to have GPUs on the go for coding and testing, and spent a ton of time reading up on Carbonate's Interactive SLURM job information. Using X11 forwarding, I was able to spin up an interactive job inside the login node from the command prompt. It popped up a Firefox browser window from the login node, which ended up being slow and not very user friendly. The same goes for Big Red 200. Eventually my efforts were in vain, and I resorted to installing a jupyter notebook server in my home directory. Although I can't request a GPU with this notebook, it allows me to debug syntax errors, visualize outputs, plot loss values, etc.


Continuing my MNIST experiments, I ran into multi-GPU distribution issues while training the unconditional Diffusion Model (DM). Without getting into too many details, I can summarize that having a custom train_step function in tensorflow, without any default loss reduction such as *tf.reduce_mean* or *tf.keras.losses.Reduction.SUM*, requires more work than plain *model.fit()*. My current loss function used for training the DM is reduced over the last channel while the rest of the shape of each batch is kept intact. When using distributed training, tensorflow requires the user to take care of gradient accumulation themselves if the loss is unreduced. So I tried to learn from the Tensorflow tutorials; alas, all their multi-distribution-strategy examples were based on functional API models, whereas my approach is based on an object oriented implementation, which led to design issues. For the sake of time management, I did a little bit of tweaking: while compiling the model under *tf.distribute.MirroredStrategy*, I passed the *tf.keras.losses.Reduction.SUM* parameter to the loss function and divided the loss by a pre-decided factor, *np.prod(out.shape[:-1])*, i.e., the number of elements in the output shape excluding the last channel, which is reduced inside the loss function. This tweak worked and does not have any unexpected impact on the architecture or the training paradigm.
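A minimal sketch of that tweak; the use of MSE and the output shape are illustrative assumptions, while the SUM reduction and the division by *np.prod(out.shape[:-1])* reflect what is described above:

.. code-block:: python

   import numpy as np
   import tensorflow as tf

   out_shape = (128, 128, 128, 1)                 # assumed per-sample output shape
   scale = np.prod(out_shape[:-1])                # elements excluding the last channel

   strategy = tf.distribute.MirroredStrategy()
   with strategy.scope():
       # SUM reduction avoids the implicit per-replica averaging; dividing by
       # `scale` restores a per-element magnitude for the reported loss.
       base_loss = tf.keras.losses.MeanSquaredError(
           reduction=tf.keras.losses.Reduction.SUM)

       def scaled_loss(y_true, y_pred):
           return base_loss(y_true, y_pred) / scale

       # model = build_dm()                       # hypothetical model constructor
       # model.compile(optimizer=tf.keras.optimizers.Adam(2e-4), loss=scaled_loss)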


I followed the architecture described in my previous blog for the DM. I trained it on the VQ-VAE latents of the MNIST dataset with 200 diffusion steps, 2 Nvidia V100 GPUs, the Adam optimizer with a 2e-4 learning rate, and a batch size of 200 per GPU, for 100+ epochs. For the generative process, I denoised random samples for 50, 100 and 200 steps with the best performing model (112 epochs). Here are the results I achieved -

DM-MNIST-112Epoch
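A rough sketch of the ancestral (reverse) sampling loop behind these generations; the noise schedule, latent shape and `predict_noise` callable are illustrative assumptions, not the project's exact implementation:

.. code-block:: python

   import numpy as np

   T = 200
   betas = np.linspace(1e-4, 0.02, T)             # assumed linear schedule
   alphas = 1.0 - betas
   alpha_bars = np.cumprod(alphas)

   def sample(predict_noise, shape=(1, 7, 7, 16)):
       x = np.random.randn(*shape)                # start from pure noise
       for t in reversed(range(T)):
           eps = predict_noise(x, t)              # model's noise estimate at step t
           coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
           mean = (x - coef * eps) / np.sqrt(alphas[t])
           noise = np.random.randn(*shape) if t > 0 else 0.0
           x = mean + np.sqrt(betas[t]) * noise
       return x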

We see some resemblance of digit shapes in the generated outputs. After further training with 300 diffusion timesteps, the best performing model (108 epochs, lowest training loss) produced drastically improved visuals -

DM-MNIST-DDIM300-108epoch


These outputs show the effectiveness of the model architecture, training parameters and the codebase.


What is coming up next week
~~~~~~~~~~~~~~~~~~~~~~~~~~~


Work on T1-weighted MRI datasets with the modified 3D conv code. Hyperparameter tuning for the best results. If time permits, work on the FID evaluation metric.


Did I get stuck anywhere
~~~~~~~~~~~~~~~~~~~~~~~~


Most of the work this week involved setting up the environment, debugging, and researching documentation. In the little time left, I ran experiments. Having both the VQ-VAE and DM code ready before I got hold of the GPUs helped me save a lot of time. This week's work was a great learning experience for me.

 
