September 08, 2023
For the past six years, I have been working with OCaml. Most of that time has been spent writing code at Ahrefs to process a lot of data and show it to users in a way that makes sense.
OCaml is a language designed with types in mind. It took me some time to learn the language, its syntax, and semantics, but once I did, I noticed a significant difference in the way I would write code and collaborate with others.
Maintaining codebases became much easier, regardless of their size. And day-to-day work felt more like having a super pro sidekick that helped me identify issues in the code as I refactored it. This was a very different feeling from what I had experienced with TypeScript and Flow.
Most of the differences, especially those related to the type system, are quite subtle. Therefore, it is not easy to explain them without experiencing them firsthand while working with a real-world codebase.
However, in this post, I will attempt to compare some of the things you can do in OCaml, and explain them from the perspective of a TypeScript developer.
Before every snippet of code, we will provide links like this: (try). These links will go either to the TypeScript playground for TypeScript snippets, or to the Melange playground, for OCaml snippets. Melange is a backend for the OCaml compiler that emits JavaScript.
Without further ado, let's go!
OCaml's syntax is very minimal (and, in my opinion, quite nice once you get used to it), but it is also quite different from the syntax in mainstream languages like JavaScript, C, or Java.
Here is a simple snippet of code in OCaml syntax (try):
let rec range a b =
  if a > b then []
  else a :: range (a + 1) b
let my_range = range 0 10
OCaml is built on a mathematical foundation called lambda calculus. In lambda calculus, function definitions and applications don't use parentheses. So it was natural to design OCaml with similar syntax to that of lambda calculus.
However, the syntax might be too foreign for someone used to JavaScript. Luckily, there is a way to write OCaml programs using a different syntax which is much closer to the JavaScript one. This syntax is called Reason syntax, and it will make it much easier to get started with OCaml if you are familiar with JavaScript.
Let's translate the example above into Reason syntax (you can translate any OCaml program to Reason syntax from the playground!):
let rec range = (a, b) =>
  if (a > b) {
    [];
  } else {
    [a, ...range(a + 1, b)];
  };
let myRange = range(0, 10);
This syntax is fully supported throughout the entire OCaml ecosystem.
To use Reason syntax, you just need to name your source file with the .re extension instead of .ml, and you're good to go.
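For a side-by-side reference, here is a rough TypeScript sketch of the same recursive range function. It is only meant as a comparison point for the two OCaml snippets above; the names mirror the Reason version.

```typescript
// Recursive range, mirroring the Reason snippet: an empty array when a > b,
// otherwise a followed by the range starting at a + 1.
const range = (a: number, b: number): number[] =>
  a > b ? [] : [a, ...range(a + 1, b)];

const myRange = range(0, 10);
```

Note that the parameter and return types have to be spelled out here, while the OCaml versions carry no annotations at all.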
Since Reason syntax is widely supported and is closer to TypeScript than OCaml syntax, we will use Reason syntax for all code snippets throughout the rest of the article. Although understanding OCaml syntax has some advantages, such as allowing us to understand a larger body of source code, blog posts, and tutorials, there is absolutely no rush to do so, and you can always learn it at any time in the future. If you're curious, we'll provide links to the Melange playground for every snippet, so you can switch syntaxes to see how a Reason program looks in OCaml syntax, or vice versa.
OCaml has great support for data types, which are types that allow values to be contained within them. They are sometimes called algebraic data types (ADTs).
One example is tuples, which can be used to represent a point in a 2-dimensional space (try):
type point = (float, float);
let p1: point = (1.2, 4.3);
One difference with TypeScript is that OCaml tuples are their own type, different from lists or arrays, whereas in TypeScript, tuples are a subtype of arrays.
Let’s see this in practice. This is a valid TypeScript program (try):
let tuple: [string, string] = ["foo", "bar"];
let len = (a: string[]) => a.length;
let u = len(tuple)
Note how the len function is annotated to take an array of strings as input, but then we apply it and pass tuple, which has type [string, string].
In OCaml, this will fail to compile (try):
let tuple: (string, string) = ("foo", "bar");
let len = (a: array(string)) => Array.length(a);
let u = len(tuple)
//          ^^^^^
// Error This expression has type (string, string)
//       but an expression was expected of type array(string)
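One practical consequence of TypeScript tuples being arrays can be sketched as follows. Because a mutable tuple inherits the array methods, the type checker accepts a push on it, and the static length no longer matches the runtime length. The variable names below are made up for illustration.

```typescript
const pair: [string, string] = ["foo", "bar"];

// `push` comes from the array methods, so the checker accepts it on a tuple.
pair.push("baz");

// The static type still claims exactly two elements...
const staticLength: 2 = pair.length;

// ...while at runtime the array now holds three.
const runtimeLength: number = (pair as string[]).length;
```

Declaring the tuple as readonly [string, string] would reject the push, but it has to be opted into.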
Another data type is records. Records are similar to tuples, but each "container" in the type is labeled (try):
type point = {
  x: float,
  y: float,
};
let p1: point = {x: 1.2, y: 4.3};
Records are similar to object types in TypeScript, but there are subtle differences in how the type system works with these types. In TypeScript, object types are structural, which means a function that works over an object type can be applied to a value of another object type as long as that type has at least the properties the function expects. Here's an example (try):
interface Todo {
  title: string;
  description: string;
  year: number;
}
interface ShorterTodo {
  title: string;
  description: string;
}
const title = (todo: ShorterTodo) => console.log(todo.title);
const todo: Todo = { title: "foo", description: "bar", year: 2021 }
title(todo)
In OCaml, you have a choice. Record types are nominal, so a function that takes a record type can only take values of that type. Let's look at the same example (try):
type todo = {
  title: string,
  description: string,
  year: int,
};
type shorterTodo = {
  title: string,
  description: string,
};
let title = (todo: shorterTodo) => Js.log(todo.title);
let todo: todo = {title: "foo", description: "bar", year: 2021};
title(todo);
//    ^^^^
// Error This expression has type todo but an expression was expected of
//       type shorterTodo
But if we want to use structural types, OCaml objects also offer that option.
Here is an example using Js.t object types in Melange (try):
let printTitle = todo => {
  Js.log(todo##title);
};
let todo = {"title": "foo", "description": "bar", "year": 2021};
printTitle(todo);
let shorterTodo = {"title": "foo", "description": "bar"};
printTitle(shorterTodo);
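Going in the opposite direction, TypeScript has no built-in nominal records, but the behavior can be approximated with a "brand" property. The sketch below is one common community pattern, not an official feature: the Brand helper and the makeTodo and makeShorterTodo functions are hypothetical names introduced only for this example.

```typescript
// Hypothetical helper: tags a structural type with a name so that two
// otherwise compatible shapes are no longer interchangeable.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

type Todo = Brand<{ title: string; description: string; year: number }, "Todo">;
type ShorterTodo = Brand<{ title: string; description: string }, "ShorterTodo">;

// The brand exists only at the type level, so values are created with a cast.
const makeTodo = (title: string, description: string, year: number): Todo =>
  ({ title, description, year } as Todo);
const makeShorterTodo = (title: string, description: string): ShorterTodo =>
  ({ title, description } as ShorterTodo);

const title = (todo: ShorterTodo): string => todo.title;

const full = makeTodo("foo", "bar", 2021);
const shorter = makeShorterTodo("foo", "bar");

const shown = title(shorter);
// title(full); // rejected by the checker: the "Todo" brand is not "ShorterTodo"
```

The extra casts and helper types are the price of opting in, whereas OCaml records behave this way without any ceremony.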
To conclude the topic of ADTs, one of the most useful tools in the OCaml toolbox is variants, also known as sum types or tagged unions.
The simplest variants are similar to TypeScript enums (try):
type shape =
  | Point
  | Circle
  | Rectangle;
The individual names of the values of a variant are called constructors in OCaml. In the example above, the constructors are Point, Circle, and Rectangle. Constructors in OCaml have a different meaning than the reserved word constructor in JavaScript.
Unlike with TypeScript enums, OCaml does not require prefixing variant values with the type name. Type inference resolves them automatically as long as the type is in scope.
This TypeScript code (try):
enum Shape {
  Point,
  Circle,
  Rectangle
}
let shapes = [
  Shape.Point,
  Shape.Circle,
  Shape.Rectangle,
];
can be written in OCaml like this (try):
type shape =
  | Point
  | Circle
  | Rectangle;
let shapes = [Point, Circle, Rectangle];
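TypeScript can get closer to this prefix-free style by using a union of string literals instead of an enum. This is a small sketch of that alternative, with the same names as the enum example:

```typescript
// A union of string literals: no enum object, and no prefix at use sites.
type Shape = "Point" | "Circle" | "Rectangle";

// The annotation is needed; without it the array is inferred as string[].
const shapes: Shape[] = ["Point", "Circle", "Rectangle"];
```

The values are plain strings at runtime, and the annotation on shapes is what keeps them checked against the union.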
Another difference is that, unlike TypeScript enums, OCaml variants can hold data for each constructor. Let's improve the shape type to include more information about each constructor (try):
type point = (float, float);
type shape =
  | Point(point)
  | Circle(point, float) /* center and radius */
  | Rect(point, point); /* lower-left and upper-right corners */
Something like this is possible in TypeScript using discriminated unions (try):
type Point = { tag: 'Point'; coords: [number, number] };
type Circle = { tag: 'Circle'; center: [number, number]; radius: number };
type Rect = { tag: 'Rect'; lowerLeft: [number, number]; upperRight: [number, number] };
type Shape = Point | Circle | Rect;
The TypeScript representation is slightly more verbose than the OCaml one, as we need to use object literals with a tag property to achieve the same effect. On top of that, variants have bigger advantages, which we will see next.
Pattern matching is one of the killer features of OCaml, along with the inference engine (which we will discuss in the next section).
Let's take the shape type we defined in the previous example. Pattern matching allows us to conditionally act on values of any type in a concise way. For example (try):
type point = (float, float);
type shape =
  | Point(point)
  | Circle(point, float) /* center and radius */
  | Rect(point, point); /* lower-left and upper-right corners */
let area = shape =>
  switch (shape) {
  | Point(_) => 0.0
  | Circle(_, r) => Float.pi *. r ** 2.0
  | Rect((x1, y1), (x2, y2)) =>
    let w = x2 -. x1;
    let h = y2 -. y1;
    w *. h;
  };
Here is the equivalent code in TypeScript (try):
type Point = { tag: 'Point'; coords: [number, number] };
type Circle = { tag: 'Circle'; center: [number, number]; radius: number };
type Rect = { tag: 'Rect'; lowerLeft: [number, number]; upperRight: [number, number] };
type Shape = Point | Circle | Rect;
const area = (shape: Shape): number => {
  switch (shape.tag) {
    case 'Point':
      return 0.0;
    case 'Circle':
      return Math.PI * Math.pow(shape.radius, 2);
    case 'Rect': {
      const w = shape.upperRight[0] - shape.lowerLeft[0];
      const h = shape.upperRight[1] - shape.lowerLeft[1];
      return w * h;
    }
    default: {
      // Ensure exhaustive checking, even though this case should never be reached
      const exhaustiveCheck: never = shape;
      return exhaustiveCheck;
    }
  }
};
We can observe how in OCaml, the values inside each constructor can be extracted directly in each branch of the switch statement. In TypeScript, on the other hand, we need to first check the tag, and then access the other properties of the object. Additionally, ensuring coverage of all cases in TypeScript using the never type is more verbose, and functions become more error-prone if we forget to add that check. In OCaml, the compiler checks exhaustiveness whenever we match on a variant, so covering all cases requires no extra effort.
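In TypeScript codebases, the never check is often factored into a small helper so that it does not have to be repeated in every switch. The following is a minimal sketch of that pattern; assertNever and describeShape are hypothetical names chosen for this example, not something provided by TypeScript itself.

```typescript
type Shape =
  | { tag: "Point"; coords: [number, number] }
  | { tag: "Circle"; center: [number, number]; radius: number }
  | { tag: "Rect"; lowerLeft: [number, number]; upperRight: [number, number] };

// Hypothetical helper: it only type-checks when `value` has been narrowed to
// `never`, which happens once every member of the union has been handled.
const assertNever = (value: never): never => {
  throw new Error(`Unhandled case: ${JSON.stringify(value)}`);
};

const describeShape = (shape: Shape): string => {
  switch (shape.tag) {
    case "Point":
      return "point";
    case "Circle":
      return `circle of radius ${shape.radius}`;
    case "Rect":
      return "rectangle";
    default:
      // Adding a new member to Shape turns this line into a compile error.
      return assertNever(shape);
  }
};
```

Even with the helper, the default branch has to be written in every switch, which is the step that is easy to forget.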
The best thing about pattern matching is that it can be used for anything: basic types like string or int, records, lists, etc.
Here is another example using pattern matching with lists (try):
let rec sumList = lst =>
  switch (lst) {
  /* Base case: an empty list has a sum of 0. */
  | [] => 0
  /* Split the list into head and tail. */
  | [head, ...tail] =>
    /* Recursively sum the tail of the list. */
    head + sumList(tail)
  };
let numbers = [1, 2, 3, 4, 5];
let result = sumList(numbers);
let () = Js.log(result);
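For comparison, a TypeScript version of the same function could look roughly like the sketch below. There is no list pattern to match on, so array destructuring plus an explicit check on the head stands in for the two branches.

```typescript
const sumList = (lst: readonly number[]): number => {
  // Destructuring splits the array into head and tail, like [head, ...tail].
  const [head, ...tail] = lst;
  // An empty array leaves `head` undefined; this plays the role of the [] branch.
  if (head === undefined) {
    return 0;
  }
  // Recursively sum the tail of the array.
  return head + sumList(tail);
};

const numbers = [1, 2, 3, 4, 5];
const result = sumList(numbers);
```

Unlike the OCaml version, nothing forces the empty case to be handled here unless stricter compiler options are enabled.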
If we wanted to write some identity function in TypeScript, we would do something like (try):
const id: <T>(val: T) => T = val => val
function useId(id: <T>(val: T) => T) {
  return [id(10)]
}
While TypeScript generics are very powerful, they lead to really verbose type annotations. As soon as our functions start taking more parameters, or growing in complexity, the length of the type signatures grows accordingly.
Plus, the generic annotations have to be carried over to any other functions that compose with the original ones, making maintenance quite cumbersome in some cases.
In OCaml, the type system is based on unification of types. This differs from TypeScript, and allows the compiler to infer types for functions (even generic ones) without the need for type annotations.
For example, here is how we would write the above snippet in OCaml (try):
let id = value => value;
let useId = id => [id(10)];
The compiler correctly infers that the type of useId is (int => 'a) => list('a).
With OCaml, type annotations are optional, but we can still add them anywhere we think they will be useful for documentation purposes (try):
let id: 'a => 'a = value => value;
let useId: (int => 'a) => list('a) = id => [id(10)];
I cannot emphasize enough how much the simplification seen above, which only involves a single function, can affect a codebase with hundreds or thousands of more complex functions in it.
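To make the comparison concrete, here is a sketch of what the inferred OCaml type (int => 'a) => list('a) looks like when written out by hand in TypeScript. Note that this is a slightly different signature from the earlier TypeScript snippet, which required its argument to be fully generic; the parameter name f and the two usage variables are made up for this example.

```typescript
const id = <T>(value: T): T => value;

// Hand-written counterpart of the type OCaml infers for useId:
// takes any function from number to A and returns an array of A.
const useId = <A>(f: (n: number) => A): A[] => [f(10)];

const viaId = useId(id); // A is inferred as number
const viaString = useId(String); // A is inferred as string
```

TypeScript does infer the type argument at each call site, but the generic signature of useId itself still has to be declared explicitly.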
JavaScript is a language where mutability is pervasive, and working with immutable data structures often requires using third-party libraries or other complex solutions.
Trying to obtain real immutable values in TypeScript is quite challenging. Historically, it has been hard to prevent mutation of properties inside objects, which was mitigated with as const.
But still, the flexibility the type system needs in order to adapt to the dynamism of JavaScript can lead to "leaks" in immutable values.
Let's see an example (try):
interface MutableValue<T> {
  value: T;
}
interface ImmutableValue<T> {
  readonly value: T;
}
const i: ImmutableValue<string> = { value: "hi" };
const m: MutableValue<string> = i;
m.value = "hah";
As you can see, even when being strict about defining the immutable nature of the value i using TypeScript expressiveness, it is fairly easy to mutate values of that type if they happen to be passed to a function that expects a type similar in shape, but without the readonly flag.
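One way to plug this leak at runtime, though not at compile time, is to freeze the object. The sketch below reuses the two interfaces from the example and assumes nothing beyond the standard Object.freeze; the variable names are made up for illustration.

```typescript
interface MutableValue<T> {
  value: T;
}
interface ImmutableValue<T> {
  readonly value: T;
}

// Freezing makes the property read-only at runtime as well.
const frozen: ImmutableValue<string> = Object.freeze({ value: "hi" });

// The type checker still accepts this assignment, as in the example above.
const alias: MutableValue<string> = frozen;

let rejected = false;
try {
  // Throws a TypeError in strict mode; silently ignored in sloppy mode.
  alias.value = "hah";
} catch (e) {
  rejected = e instanceof TypeError;
}
```

Either way the value stays "hi", but the mistake only surfaces when the code runs, if it surfaces at all.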
In OCaml, immutability is the default, and it's guaranteed. Records are immutable (like tuples, lists, and most basic types), and although we can define mutable fields in them, something like the previous TypeScript leak is not possible (try):
type immutableValue('a) = {value: 'a};
type mutableValue('a) = {mutable value: 'a};
let i: immutableValue(string) = {value: "hi"};
let m: mutableValue(string) = i;
m.value = "hah";
When trying to assign i to m we get an error: This expression has type immutableValue(string) but an expression was expected of type mutableValue(string).
This might not be as impactful a feature as the ones we just went through, but it is really nice that in OCaml there is no need to manually import values from other modules.
In TypeScript, to use some function bar defined in a module located in ../../foo.ts, we have to write:
import {bar} from "../../foo.ts";
let t = bar();
In OCaml, libraries and modules in your project are all available for your program to use, so we would just write:
let t = Foo.bar()
The compiler will figure out how to find the paths to the module.
Currying is the technique of translating the evaluation of a function that takes multiple arguments into evaluating a sequence of functions, each with a single argument. It is a feature that might be more desirable for those looking into learning more about functional programming.
While it is possible to use currying in TypeScript, it becomes quite verbose (try):
const mix = (a: string) => (b: string) => b + " " + a;
const beef = mix("soaked in BBQ sauce")("beef");
const carrot = function () {
  const f = mix("dip in hummus");
  return f("carrot");
}();
In OCaml, all functions are curried by default. This is how similar code would look (try):
let mix = (a, b) => b ++ " " ++ a;
let beef = mix("soaked in BBQ sauce", "beef");
let carrot = {
  let f = mix("dip in hummus");
  f("carrot");
};
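In TypeScript, a regular two-argument function can be converted into its curried form with a small helper. This is a sketch under the assumption that only two arguments are needed; curry2 is a hypothetical name, not a built-in.

```typescript
// Hypothetical helper: turns (a, b) => c into a => b => c.
const curry2 =
  <A, B, C>(f: (a: A, b: B) => C) =>
  (a: A) =>
  (b: B): C =>
    f(a, b);

const mix = (a: string, b: string): string => b + " " + a;

const curriedMix = curry2(mix);
const withHummus = curriedMix("dip in hummus"); // partially applied
const carrot = withHummus("carrot");
```

A separate helper would be needed for each arity, whereas in OCaml partial application works on any function without extra code.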
One of the best parts of OCaml is how flexible it is in the number of places your code can run. Your applications written in OCaml can run natively on multiple devices, with very fast startup, as there is no need to start a virtual machine.
The nice thing is that OCaml does not compromise expressiveness or ergonomics to obtain really fast execution times. As this study shows, the language hits a great balance between verbosity (Y axis) and performance (X axis). It provides features like garbage collection or a powerful type system as we have seen, while producing small, fast binaries.
This is not a feature particular to OCaml, as JavaScript has allowed us to write applications that run on the server and the client for years. But I want to mention it because with OCaml one can obtain the upsides of sharing the same language across boundaries, together with a precise type system, a fast compiler, and an expressive and consistent functional language.
At Ahrefs, we work with the same language in frontend and backend, including tooling like the build system and package manager (we wrote about it here). Having the OCaml compiler know about all our code allows us to support a number of applications and systems with a reasonably sized team working across different time zones.
I hope you enjoyed the article. If you want to learn more about OCaml as a TypeScript developer, I can recommend the Melange documentation site, which has plenty of information about how to get started. One page in particular, Melange for X developers, summarizes some of the things we have discussed and expands on others.
If you want to share any feedback or comments, please comment on Twitter, or join the Reason Discord to ask questions or share your progress on any project or idea built with OCaml.