I cut the run time from 26 minutes to about 100 seconds by fetching larger chunks of data at a time than before. I also implemented hyperbox for the loading matrix, which is essentially a way of subsampling from the loading matrix so that the data is still well represented with fewer samples. This introduces some additional loss, but I'm inclined to conclude that the cost is manageable and that the speedup outweighs it. In fact, since I'm using an approximate PCA method, some degree of loss is expected anyway.
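To illustrate why chunk size matters, here is a minimal sketch rather than the project's actual code. It assumes the data is an in-memory NumPy array and that each slice stands in for one fetch; the names `iter_chunks` and `chunked_gram` are hypothetical. It accumulates the same matrix with small and large chunks, so only the number of fetches changes.

```python
import numpy as np


def iter_chunks(data, chunk_size):
    """Yield consecutive row blocks of `data` with at most `chunk_size` rows each."""
    for start in range(0, data.shape[0], chunk_size):
        yield data[start:start + chunk_size]


def chunked_gram(data, chunk_size):
    """Accumulate X^T X one chunk at a time; return it with the number of fetches."""
    n_features = data.shape[1]
    gram = np.zeros((n_features, n_features))
    n_fetches = 0
    for block in iter_chunks(data, chunk_size):
        # Each block stands in for one (hypothetical) fetch from storage.
        gram += block.T @ block
        n_fetches += 1
    return gram, n_fetches


rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 8))

gram_small, fetches_small = chunked_gram(X, chunk_size=10)
gram_large, fetches_large = chunked_gram(X, chunk_size=2_000)

# Same result up to floating-point error, but far fewer fetches with large chunks.
print(fetches_small, fetches_large, np.allclose(gram_small, gram_large))
```

When every fetch carries a fixed overhead, fewer and larger fetches can reduce the total run time, as long as a chunk still fits in memory.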
What did I do this week?
Improved overall PCA performance
Did I get stuck anywhere?
I was getting errors because I had miscalculated the dimensions needed for reconstruction.
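For reference, this is how the shapes line up in a plain PCA reconstruction. It is a small sketch using an exact SVD in NumPy, not the approximate method used in the project, and the sizes and variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_components = 200, 12, 3  # illustrative sizes only
X = rng.standard_normal((n_samples, n_features))

# Center the data, then take the top-k right singular vectors as components.
mean = X.mean(axis=0)                       # shape (n_features,)
Xc = X - mean                               # shape (n_samples, n_features)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:n_components]              # shape (n_components, n_features)

# Project to the low-dimensional space, then map back.
scores = Xc @ components.T                  # shape (n_samples, n_components)
X_hat = scores @ components + mean          # shape (n_samples, n_features)

print(components.shape, scores.shape, X_hat.shape)
```

The reconstruction has to come back to the original `(n_samples, n_features)` shape. Swapping `components` and `components.T` in either product is an easy way to get a shape error.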
What will I work on next week?
It seems like the PCA work is almost finished (?). Once I clean up the code and produce some example notebooks for reference, I will, with my mentor's approval, probably begin working on non-negative matrix factorization, if not this week then next week.