'In OCaml, what is the difference between `'a.` and `type a.` and when to use each?

OCaml has several different syntaxes for a polymorphic type annotation :

let f :         'a -> 'a = … (* Isn’t this one already polymorphic? (answer: NO) *)
let f : 'a.     'a -> 'a = …
let f : type a.  a ->  a = …

We often see them when using fancy algebraic datatypes (typically, GADTs), where they seem to be necessary.

What is the difference between these syntaxes? When and why each one must be used?



Solution 1:[1]

I will use the following code (drawn from that other question) as a running example. Here, the type annotation on the definition of reduce is actually required to make it typecheck.

(* The type [('a, 'c) fun_chain] represents a chainable list of functions, i.e.
 * such that the output type of one function is the input type of the next one;
 * ['a] and ['c] are the input and output types of the whole chain.
 * This is implemented as a recursive GADT (generalized algebraic data type). *)
type (_, _) fun_chain =
  | Nil  : ('a, 'a) fun_chain
  | Cons : ('a -> 'b) * ('b, 'c) fun_chain -> ('a, 'c) fun_chain

(* [reduce] reduces a chain to just one function by composing all
 * functions of the chain. *)
let rec reduce : type a c. (a, c) fun_chain -> a -> c =
  fun chain x ->
    begin match chain with
    | Nil              -> x
    | Cons (f, chain') -> reduce chain' (f x)
    end

The very short story

On let-definitions, an annotation like : 'a -> 'a doesn’t force polymorphism (the syntax is misleading, because the same annotation on a val-declaration i.e. in a module signature does enforce polymorphism).

: type a. … is a type annotation with explicit (forced) polymorphism. You can think of this as the universal quantifier: ? a, …. In the code above we make use of advanced features of the type system, namely, polymorphic recursion and branches with different types, and that leaves type inference at a loss. In order to have a program typecheck in such a situation, we need to force polymorphism like this.

The pragmatic story

: type a. … is a short-hand syntax that combines two features:

  1. an explicitly polymorphic annotation : 'a. …
    • useful for ensuring a definition is as general as intended
    • required when recursion is done with type parameters different from those of the initial call (“polymorphic recursion” i.e. recursion on “non-regular” ADTs)
  2. a locally abstract type (type a) …
    • required when different branches have different types (i.e. when pattern-matching on “generalized” ADTs)
    • allows you to refer to type a from inside the definition, typically when building a first-class module (I won’t say more about this)

Here we use the combined syntax because our definition of reduce falls under both situations in bold.

  1. We have polymorphic recursion because Cons builds a (a, c) fun_chain from a (b, c) fun_chain: the first type parameter differs (we say that fun_chain is a “non-regular” ADT).

  2. We have branches with different types because Nil builds a (a, a) fun_chain whereas Cons builds a (a, c) fun_chain (we say that fun_chain is a “generalized” ADT, or GADT for short).

Just to be clear: : 'a. … and : type a. … produce the same signature for the definition. Choosing one syntax or the other only has an influence on how its body is typechecked. For most intents and purposes, you can forget about : 'a. … and just remember the combined form : type a. … (hence my very short story). Alas, the latter does not completely subsume the former, there are rare situations where writing : type a. … wouldn’t work and you would need : 'a. … (see @octachron’s answer) but, hopefully, you won’t stumble upon them often.

The long story

Explicit polymorphism

OCaml type annotations have a dirty little secret: writing let f : 'a -> 'a = … doesn’t force f to be polymorphic in 'a. The compiler unifies the provided annotation with the inferred type and is free to instantiate the type variable 'a while doing so, leading to a less general type than intended. For instance let f : 'a -> 'a = fun x -> x+1 is an accepted program and leads to val f : int -> int. To ensure the function is indeed polymorphic (i.e. to have the compiler reject the definition if it is not general enough), you have to make the polymorphism explicit, with the following syntax:

let f : 'a. 'a -> 'a = …

For a non-recursive definition, this is merely the human programmer adding a constraint which makes more programs be rejected.

In the case of a recursive definition however, this has another implication. When typechecking the body, the compiler will unify the provided type with the types of all occurrences of the function being defined. Type variables which are not marked as polymorphic will be made equal in all recursive calls. But polymorphic recursion is precisely when we recurse with differing type parameters; without explicit polymorphism, that would either fail or infer a less general type than intended. To make it work, we explicitly mark which type variables should be polymorphic.

Note that there is a good reason why OCaml cannot typecheck polymorphic recursion on its own: there is undecidability around the corner (see Wikipedia for references).

As an example, let’s do the job of the typechecker on this faulty definition, where polymorphism is not made explicit:

(* does not typecheck! *)
let rec reduce : ('a, 'c) fun_chain -> 'a -> 'c =
  fun chain x ->
    begin match chain with
    | Nil              -> x
    | Cons (f, chain') -> reduce chain' (f x)
    end

We start with reduce : ('a, 'c) fun_chain -> 'a -> 'c and chain : ('a, 'c) fun_chain for some type variables 'a and 'c.

  • In the first branch, chain = Nil, so we learn that in fact chain : ('c, 'c) fun_chain and 'a == 'c. (That doesn’t matter right now, though.)

  • In the second branch, chain = Cons (f, chain') so there exists an arbitrary type b such that f : 'a -> b and chain' : (b, 'c) fun_chain. Then we must typecheck the recursive call reduce chain', so the expected argument type ('a, 'c) fun_chain must unify with the provided argument type (b, 'c) fun_chain; but nothing tells us that b =?= 'a. So we reject this definition, preferably with a cryptic error message:

Error: This expression has type ($Cons_'b, 'c) fun_chain
       but an expression was expected of type ('c, 'c) fun_chain
       The type constructor $Cons_'b would escape its scope

If now we make polymorphism explicit:

(* still does not typecheck! *)
let rec reduce : 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c =
  …

Then typechecking the recursive call is not a problem anymore, because we now know that reduce is polymorphic with two “type parameters” (non-standard terminology), and these type parameters are instantiated independently at each occurrence of reduce; the recursive call uses b and 'c even though the enclosing call uses 'a and 'c.

Locally abstract types

But we have a second problem: the other branch, for constructor Nil, has made 'a be unified with 'c. Hence we end up inferring a less general type than what the annotation mandated, and we report an error:

Error: This definition has type 'c. ('c, 'c) fun_chain -> 'c -> 'c
       which is less general than 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c

The solution is to turn the type variables into locally abstract types, which cannot be unified (but we can still have type equations about them). That way, type equations are derived locally to each branch and they do not transpire outside of the match with construct.

Solution 2:[2]

The practical answer when hesitating between 'a . ... and type a. ... is to always use the latter form:

  • type a. ... works with:
    • polymorphic recursion
    • GADTs
    • raise type errors early

whereas:

  • 'a. ... works with
    • polymorphic recursion
    • polymorphic quantification over row type variables

Thus type a. ... is generally a strictly superior version of 'a . ... .

Except for the last strange point. For the sake exhaustiveness, let me give an example of quantification over a row type variable:

let f: 'a. ([> `X ] as 'a) -> unit = function
  | `X -> ()
  | _ -> ()

Here the universal quantification allows us to control precisely the row variable type. For instance,

let f: 'a. ([> `X ] as 'a) -> unit = function
  | `X | `Y -> ()
  | _ -> ()

yields the following error

Error: This pattern matches values of type [? `Y ]
      but a pattern was expected which matches values of type [> `X ]
      The second variant type is bound to the universal type variable 'a,
      it may not allow the tag(s) `Y

This use case is not supported by the form type a. ... mostly because the interaction of locally abstract type, GADTs type refinement and type constraints has not been formalized. Thus this second exotic use case is not supported.

Solution 3:[3]

TL;DR; In your question, only the last two forms are polymorphic type annotations. The latter of these two forms, in addition to annotating a type as polymorphic, introduces a locally abstract type1. This is the only difference.

The longer story

Now let's speak a little bit about the terminology. The following is not a type annotation (or, more properly, doesn't contain any type annotations),

let f :         'a -> 'a = …

It is called a type constraint. A type constraint requires the type of the defined value to be compatible with the specified type schemata.

In this definition,

let f : 'a.     'a -> 'a = …

we have a type constraint that includes a type annotation. The phrase "type annotation" in OCaml parlance means: annotating a type with some information, i.e., attaching some attribute or a property to a type. In this case, we annotate type 'a as polymorphic. We're not annotating the value f as polymorphic neither are we annotating the value f with type 'a -> 'a or 'a. 'a -> 'a. We are constraining the value of f to be compatible with type 'a -> 'a and annotate 'a as a polymorphic type variable.

For a long time, syntax 'a. was the only way to annotate type as polymorphic, but later OCaml introduced locally abstract types. They have the following syntax, which you could also add to your collection.

let f (type t) : t -> t = ...

Which creates a fresh abstract type constructor that you can use in the scope of the definition. It doesn't annotate t as polymorphic though, so if you want it to be explicitly annotated as polymorphic you could write,

let f : 'a. 'a -> 'a = fun (type t) (x : t) : t -> ...

which includes both an explicit type annotation of 'a as polymorphic and the introduction of a locally abstract type. Needless to say, it is cumbersome to write such constructions, so a little bit later (OCaml 4.00) they introduced syntactic sugar for that so that the above expression could be written as simple as,

let f : type t. t -> t = ...

Therefore, this syntax is just an amalgamation of two rather orthogonal features: locally abstract types and explicitly polymorphic types.

It is not however that the result of this amalgamation is stronger than its parts. It is more like an intersection. Whilst the generated type is both locally abstract and polymorphic, it is constrained to be a ground type. In other words, it constrains the kind of the type, but this is a completely different problem of higher-kinded polymorphism.

And to conclude the story, despite the syntax similarities, the following is not a type annotation,

val f : 'a -> 'a

It is called a value specification, which is a part of a signature, and it denotes that the value f has type 'a -> 'a.


1)) Locally abstract types have two main use cases. First, you can use them inside your expression in places where type variables are not permitted, e.g., in modules and exceptions definitions. Second, the scope of the locally abstract type exceeds the scope of the function, which you can employ by unifying types that are local to your expression with the abstract type to extend their scopes. The underlying idea is that the expression can not outlive its type and since in OCaml types could be created in runtime we have to be careful with the extent of the type as well. Unifying a locally created type with a locally abstract type via a function parameter guarantees that this type will be unified with some existing type in the place of the function application. Intuitively, it is like passing a reference for a type, so that the type could be returned from the function.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 octachron
Solution 3