Project: Generic function support for LPython (https://github.com/lcompilers/lpython)
Organization: Python Software Foundation (LPython)
Mentors: Ondřej Čertík, Gagandeep Singh, Rohit Goswami
Contributor: Luthfan Anshar Lubis
Generics is a common functionality found in statically-typed programming languages to allow easier maintenance of programs with identical implementations but different type signatures. LPython, a statically-typed Python compiler does not yet support this. In this project, we introduce generic functions as functions with type variables; these variables are statically type-checked and made concrete when lower-level code is generated. To allow type checks for generic function calls, restrictions (similar to traits in Rust) are provided to ensure safety of type parameters.
Programs in LPython are statically-typed, requiring type annotations by users. They are first compiled into Abstract Semantic Representation (ASR), an intermediate representation that carries all the syntactic and semantic (including typing) information of the programs. From this IR, the programs are further compiled to lower level languages such as C and LLVM.
Another important feature of LPython is ensuring that programs for LPython can also be run in CPython.
The project will be explained through several phases:
1. Introducing type variables
The first phase of the implementation was adding syntax support for type variable declarations. On the Python level, we rely on
TypeVar provided by Python 3. On the ASR level, I added a new kind of type for expression called
TypeParameter. This allows the declaration of the following function:
T = TypeVar('T') def add(x: T, y: T) -> T: return x + y
The arguments and return value of the function
add are typed with a
TypeParameter on the ASR level.
Simple checks on the type variable declarations are also implemented. First, ensuring that a type variable declaration is unique. Second, ensuring that a function can only be typed with an already declared type variable.
Related blog posts:
Adding type variable support to LPython, link: https://blogs.python-gsoc.org/en/luthfans-blog/adding-type-variable-support-to-lpython-1st-week/
Dealing with conflict and ASR, link: https://blogs.python-gsoc.org/en/luthfans-blog/dealing-with-conflict-and-asr-2nd-week/
2. Implementing generic function instantiation
A big leap from the first phase, I implemented a working generic function instantiation that can be compiled to LLVM and can be run, but still without type checks.
Using the example from the previous phase, it is now possible to have the following code:
T = TypeVar('T') def add(x: T, y: T) -> T: return x + y print(add(1,2)) print(add('a','b'))
The two function calls
add('a','b') instantiate the function
add into two different functions on the ASR level, each with different type signatures. If we were to express it with Python syntax, we would have:
def __lpython_generic_add_0(x: i32, y: i32) -> i32: ... def __lpython_generic_add_1(x: str, y: str) -> str: ...
i32 represents 32-bit integer and
str represents string.
At the same time, the function calls themselves are correspondingly renamed into:
At the function call site, the consistence of the type substitution between the arguments types (
str) with the type variables (
T) is also checked. The function call
add(1,'a') is rejected.
To allow compilation to LLVM, the generic function (
add) is ignored while the instantiated functions (
__lpython_generic_add_0, __lpython_generic_add_1) and renamed functions calls are compiled normally with the existing compilation.
This step particularly took a long time due to several reasons. First, the lack of familiarity with the already existing tools within LPython that can be utilized for function instantiations. Originally I started with a function instantiator that I personally defined, although later on finding out from one of the mentors about an ASR duplicator. Second, we did not start with a concrete design for the ASR grammar. This resulted in a couple iterations on the grammar for generic functions.
Related blog posts:
Instantiating generic functions Into valid functions in ASR, link: https://blogs.python-gsoc.org/en/luthfans-blog/instantiating-generic-functions-into-valid-functions-in-asr-3rd-week/
Revisiting function generation, link: https://blogs.python-gsoc.org/en/luthfans-blog/revisiting-function-generation-4th-week/
Getting the generated ASR, link: https://blogs.python-gsoc.org/en/luthfans-blog/getting-the-generated-asr-for-generic-functions-5th-week/
Completing the generics early prototype, link: https://blogs.python-gsoc.org/en/luthfans-blog/completing-the-generics-early-prototype-6th-week/
3. Adding necessary functionalities and handling ASR changes
I added support for
List types, along with corresponding examples where these types are used.
I also added support for generic
Subfunction (functions without return values). However, during the implementation, the main branch of LPython made a change and combined both
Subfunction, requiring modifications on the generic functions support.
Related blog posts:
Adding more generic examples, link: https://blogs.python-gsoc.org/en/luthfans-blog/adding-more-generic-examples-7th-week/
4. Adding hard-coded restrictions for type variables
The next step is adding a mechanism to type check generic function calls and operations involving values with type variables. Consider again the
add program, there was no check on the operation
x+y against the arguments with type
T. There was also no type check on
add(1,2) to make sure that the arguments
2 can handle the operation inside
As the first approach to this, I added restrictions on the type variable declarations. The restrictions I added were
Divisible. These restrictions are hard-coded in the way that they are hard-coded into the compiler to be associated with specific computations. With restrictions, type variables declarations become:
T = TypeVar('T', bound=SupportsPlus)
SupportsPlus is associated with addition between integer, real, and string. However, it only supports addition between same typed arguments.
With these restrictions, it is possible to check whether the generic operation
x+y is safe by checking the restrictions on the type variables associated with both arguments. It is also possible to check whether
add(1,2) is safe by checking that
2 are typed as integers which does support addition.
Related blog posts:
Next step, adding restriction, link: https://blogs.python-gsoc.org/en/luthfans-blog/next-step-adding-restriction-8th-week/
Working further on restriction type check, link: https://blogs.python-gsoc.org/en/luthfans-blog/working-further-on-restriction-type-check-9th-week/
Extending restriction and handling ASR changes, link: https://blogs.python-gsoc.org/en/luthfans-blog/extending-restriction-and-handling-asr-changes-10th-week/
5. Designing and implementing user-defined restrictions
The above approach to restrictions does handle type check on operations, but limit them to only those hard-coded into the compilers. We require a mechanism that can allow users to freely define restrictions according to need.
To illustrate this, let us consider a different generic function
meanthat takes a list of
T as arguments and calculate the mean value of the elements in the list:
def mean(x: list[T]) -> f64: k: i32 = len(x) if k == 0: return 0.0 res: T res = 0.0 i: i32 for i in range(k): res = res + x[i] return res/k
Several computations are made on the variable
res with type
T. A zero assignment, an addition, and a division. To type check the body of
mean, we can abstract the computations as restrictions, expressed as body-less functions.
@restriction def zero(x: T) -> T: pass @restriction def add(x: T, y: T) -> T: pass @restriction def div(x: T, k: i32) -> f64: pass def mean(x: list[T], **kwargs) -> f64: k: i32 = len(x) if k == 0: return 0.0 res: T res = zero(x) i: i32 for i in range(k): res = plus(res, x[i]) return div(res/k)
Then, the function calls can assign specific functions that will take over the place of these restrictions (
print(mean([1,2], zero=int.__zero__, add=int.__add__, div=int.__div__)) print(mean([1.0,2.0], zero=real.__zero__, add=real.__add__, div=real.__div__))
The assignment between a restriction (eg.
zero) and a function (eg.
int.__zero__) is also checked by making sure if the type substitution remains consistent or not.
https://github.com/lcompilers/lpython/pull/1048 (still open)
Related blog posts:
Designing user-defined restrictions, link: https://blogs.python-gsoc.org/en/luthfans-blog/designing-user-defined-restrictions-11th-week/
I have completed a prototype implementation for LPython. The implementation is strongly type-checked (strong concept) by using restrictions to define computations on type variables. Type-checked generic function calls are compiled into ASR then LLVM.
- More complicated generic functions, such as recursive generic functions and nested generic function calls will require some modifications on both type checks and function instantiations
- Designs can be further simplified to avoid defining similar restrictions