Pragmatics of typed intermediate languages

One trend in compilation is to use typed intermediate languages. Haskell's GHC, whose Core intermediate language is a variant of System F-omega, is an example of this architecture [1]. Another is LLVM, which has a typed intermediate representation at its core [2]. The benefit of this approach is that errors in the transformations that make up the code generator can be detected early. In addition, the type information can be used during optimization and code generation.

For efficiency, typed IRs are type-checked rather than having their types inferred. To make type-checking fast, every variable and every binder carries a type annotation.
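To illustrate, here is a minimal sketch (hypothetical names, not any particular compiler's IR) of a tiny typed IR whose binders carry their types, so checking is a single syntax-directed pass with no inference or unification:

```haskell
data Ty = TInt | TArrow Ty Ty deriving (Eq, Show)

data Expr
  = Var String
  | Lit Int
  | Lam String Ty Expr   -- the binder is annotated with its type
  | App Expr Expr
  deriving (Eq, Show)

type Env = [(String, Ty)]

-- Checking just reads the annotation off each binder; no inference needed.
typeOf :: Env -> Expr -> Maybe Ty
typeOf env (Var x)     = lookup x env
typeOf _   (Lit _)     = Just TInt
typeOf env (Lam x t b) = TArrow t <$> typeOf ((x, t) : env) b
typeOf env (App f a)   = do
  TArrow dom cod <- typeOf env f          -- fails (Nothing) on non-functions
  t <- typeOf env a
  if t == dom then Just cod else Nothing
```

Because each case inspects a node once and looks up at most one binding, checking is linear in the size of the term (up to the cost of environment lookups).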

However, many transformations in the compiler pipeline introduce new variables. For example, a normalization transformation K(.) might transform an application

M(N)

into an expression like

let x = K(M) in
let y = K(N) in x(y)
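One way such a pass can avoid re-checking is to return each residual term's type alongside the term itself, computed from the binder annotations on the way down; the fresh let-bound variables then get their annotations for free. A sketch under that assumption (the IR, the `Let` form, and all names here are hypothetical):

```haskell
data Ty = TInt | TArrow Ty Ty deriving (Eq, Show)

data Expr
  = Var String
  | Lit Int
  | Lam String Ty Expr
  | App Expr Expr
  | Let String Ty Expr Expr   -- let x : t = e1 in e2
  deriving (Eq, Show)

type Env = [(String, Ty)]

-- norm env n e = (residual term, its type, next fresh-name counter).
-- The type travels with the term, so no separate re-check is run.
norm :: Env -> Int -> Expr -> (Expr, Ty, Int)
norm env n (App f a) =
  let (f', tf, n1) = norm env n  f
      (a', _,  n2) = norm env n1 a
  in case tf of
       TArrow dom cod ->
         let x = "t" ++ show n2          -- fresh variable for K(M)
             y = "t" ++ show (n2 + 1)    -- fresh variable for K(N)
         in ( Let x tf f' (Let y dom a' (App (Var x) (Var y)))
            , cod, n2 + 2 )
       _ -> error "application of a non-function"
norm env n e@(Var x) = case lookup x env of
  Just t  -> (e, t, n)
  Nothing -> error ("unbound variable: " ++ x)
norm _ n e@(Lit _) = (e, TInt, n)
norm env n (Lam x t b) =
  let (b', tb, n') = norm ((x, t) : env) n b
  in (Lam x t b', TArrow t tb, n')
norm env n (Let x t e1 e2) =
  let (e1', _,  n1) = norm env n  e1
      (e2', t2, n2) = norm ((x, t) : env) n1 e2
  in (Let x t e1' e2', t2, n2)
```

The environment here is only the same one the transformation already threads for scoping; no second type-checking pass is needed.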

Question. How do compilers handle the problem of giving types to newly introduced variables? Do they re-typecheck, say K(M) and K(N) in the example above? Isn't that time-consuming? Does it require passing an environment around? Or do they use maps from AST nodes to type information to avoid re-running the type checker?
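For what it's worth, GHC's Core takes a third option: every variable occurrence (an Id) stores its own type, so the type of any term can be read off in one cheap pass (`exprType`) with no environment and no node-to-type map. A minimal sketch of that idea (the datatype and names below are illustrative, not GHC's actual definitions):

```haskell
data Ty = TInt | TArrow Ty Ty deriving (Eq, Show)

-- Every variable occurrence carries its type, in the style of
-- GHC Core's Ids, so reading a type back needs no environment.
data Expr
  = Var String Ty
  | Lit Int
  | Lam String Ty Expr
  | App Expr Expr
  deriving (Eq, Show)

-- One syntax-directed pass; assumes the term is already well typed,
-- so App merely projects the function type's codomain.
exprType :: Expr -> Ty
exprType (Var _ t)   = t
exprType (Lit _)     = TInt
exprType (Lam _ t b) = TArrow t (exprType b)
exprType (App f _)   = case exprType f of
                         TArrow _ cod -> cod
                         _            -> error "ill-typed application"
```

The trade-off is redundancy: the same type is stored at every occurrence, and transformations must keep the annotations consistent, which is exactly what a Core lint pass checks.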


  1. S. Marlow, S. Peyton Jones, The Glasgow Haskell Compiler.

  2. LLVM Language Reference Manual.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
