GSoC Final Report
After ten weeks of hard work on the project "LLVM Back-end for the Tensor Algebra Compiler" for Google Summer of Code 2021, I was able to develop the following contributions:
- Support for branch and loop operations (IfThenElse, While, For)
- Support for binary and unary operations (Neg, Sub, Div, Rem, Min, Max, And, Or, BitOr, BitAnd)
- Support for comparison operators (LTE, GTE, LT, GT, EQ, NEQ)
- Support for other operations (Assign, Cast, Case, Indices property, Comment, BlankLine)
- Support for booleans
- Added new tests to assess the correctness of new operations
These contributions enable basic use of TACO with LLVM, such as arithmetic operations on sparse or dense tensors, either passed as command-line arguments with the LLVM flag or through the TACO C++ library.
The project resulted in six pull requests and five issues. Five PRs were merged, and one was closed because its content had already been developed by Guilherme, my mentor, on another branch. Three issues are still open.
The project can be separated into 3 phases:
- First impressions and branch update
- Implementation of Basic Operations
- Fixing the old implementation, writing tests, and searching for bugs
The first phase was developed in the first two weeks and resulted in the following PRs:
The goal of this phase was to update the llvm-backend branch that I used for the project with the latest changes from main, and to update the README with instructions for using TACO with LLVM support in a conda environment.
I also opened an issue reporting that the LLVM IR generated by TACO lacks target machine information when compared with the IR generated by clang:
This issue has not been resolved yet because it does not affect program operation or IR generation.
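To make the difference concrete, here is a minimal sketch of how one could check a textual IR module for target information. It assumes the missing information is the module-level `target datalayout` and `target triple` lines that clang normally emits; the issue may cover more than that. The function names and both sample modules are hypothetical and only serve as illustration.

```python
import re

# Matches module-level lines such as: target triple = "x86_64-unknown-linux-gnu"
TARGET_LINE = re.compile(
    r'^\s*target\s+(triple|datalayout)\s*=\s*"([^"]*)"', re.MULTILINE
)


def target_info(ir_text):
    """Return the target lines found in textual LLVM IR as a dict."""
    return {kind: value for kind, value in TARGET_LINE.findall(ir_text)}


def missing_target_info(ir_text):
    """List which of the two target lines are absent from the module."""
    found = target_info(ir_text)
    return [kind for kind in ("datalayout", "triple") if kind not in found]


# Illustrative module in the style clang emits (values are examples only).
clang_like = '''target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @main() {
  ret i32 0
}
'''

# Illustrative module without any target lines.
bare = '''define i32 @main() {
  ret i32 0
}
'''

if __name__ == "__main__":
    print(missing_target_info(clang_like))
    print(missing_target_info(bare))
```

A module without these lines is still valid IR, which is consistent with the issue not affecting program operation.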
The second and main phase was developed from the third to the eighth week and resulted in the following PRs:
- Fix 'for' loop: reordering basic blocks of 'For' (not merged)
- Add support for more operations
- Add new Operations
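As background for the 'For' fix listed above, the sketch below shows one common way a counted loop is laid out as basic blocks in LLVM IR: an entry block, a header that tests the condition, a body, and an exit. This is a general illustration, not the code from the PR; the block names, the function name `sum_below`, and the helper `lower_for_loop` are all hypothetical.

```python
def lower_for_loop(name="sum_below"):
    """Return textual LLVM IR for:
    acc = 0; for (i = 0; i < n; i++) acc += i; return acc;
    """
    # Blocks are listed in the order they are emitted.
    blocks = [
        ("entry", ["br label %for.header"]),
        ("for.header", [
            # phi nodes merge the initial values with the values from the body
            "%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]",
            "%acc = phi i64 [ 0, %entry ], [ %acc.next, %for.body ]",
            "%cond = icmp slt i64 %i, %n",
            "br i1 %cond, label %for.body, label %for.exit",
        ]),
        ("for.body", [
            "%acc.next = add i64 %acc, %i",
            "%i.next = add i64 %i, 1",
            "br label %for.header",
        ]),
        ("for.exit", ["ret i64 %acc"]),
    ]
    lines = [f"define i64 @{name}(i64 %n) {{"]
    for label, instructions in blocks:
        lines.append(f"{label}:")
        lines.extend(f"  {instruction}" for instruction in instructions)
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(lower_for_loop())
```

Every block ends in a terminator (`br` or `ret`), and the header's phi nodes name the blocks that branch into it, so the control flow is defined by those branches rather than by the textual order alone.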
The main implementation was developed during these weeks with the assistance of my mentor, Guilherme Leobas. As mentioned before, the first PR wasn't merged because its content had already been implemented in PR #7. The issues that I opened and solved during these weeks were the following:
- Allocate op doesn't create malloc greater than 10000*sizeof(int64)
- TACO doesn't generate properly LLVM IR for float types
The last two weeks of development were used to write new tests, find bugs in corner cases, and format the code, resulting in the following PRs:
The tests were implemented in a separate folder called "llvm-examples" and in gTest, the framework TACO uses to automate tests. The following issues were opened for bugs found by these tests:
- TACO doesn't generate an LLVM IR that produces the correct result when specifying just the non-zero values of a tensor
- Lack of support for Yield Operation
These issues are the starting point for future improvements to the project.
This experience was excellent: I learned more about TACO, LLVM, compilers, open-source cooperation, and development. I hope to be able to continue contributing to this project soon.