Home > Articles > Software Development & Management

What Language I Use for… Building Scalable Servers: Erlang

  • Print
  • + Share This
David Chisnall tells you why his language of choice for building scalable servers is Erlang.
From the author of

I first felt the real power of Erlang during my PhD, when I wrote a server for remote paging with aggressive prefetching on my (single-core) laptop and then deployed it on a 64-processor SGI machine and saw it scale smoothly up to the available processors.

It's quite easy to write code in C that has a handful of threads. It's even easy to have a few dozen threads if they're largely independent. It's incredibly hard to write C code that has a few thousand threads, all communicating closely, and without insane overheads from contention when you try scaling it out beyond about eight cores.

Erlang is the only language where I not only have managed to write code that scaled like this but also where I have the code that did it and worked the first time.

Erlang Processes

The basic unit of Erlang code is a process. These are very lightweight. Allocating a process in Erlang takes about the same amount of overhead as allocating an object in another language. You can spawn very short-lived processes for asynchronous events and then let them exit on completion, or you can manage pools of threads that distribute work and spawn more as required.

Erlang processes are not the same as processes in the host operating system. They are scheduled by the Erlang VM, in the same way that Java threads are not necessarily threads in the host operating system. They're called processes, not threads, because Erlang has a shared-nothing model.

As I've said before, if you want to write scalable, maintainable, parallel code, there is one rule that you must abide by: No data may be both shared and mutable. Erlang enforces this because within a process it has an (almost) purely functional model. All variables are immutable, with just one exception: the process dictionary. This is a simple map that is associated with the process and can be modified, but not shared (keys and values in it can be shared, but they are immutable).


Don't Miss These Related Articles Also by David Chisnall

Learn more about David Chisnall


Message Passing

An Erlang process is typically implemented as a tail-recursive function that waits for a message, processes it, and then calls itself. Message sending in Erlang is asynchronous and buffered. This means that you avoid some of the (already small) cost of context switching between them by sending several messages between context switches.

The receive statement in Erlang receives the first message matching a particular pattern. There are generally two ways of getting messages. The first, to serve requests in order, is to do a receive with no pattern and then have a case statement, or do a receive with a number of different patterns to match the received message and handle it. The second method, which allows you to have some notion of priority, is to get the first message matching a specific pattern; then if there isn't one, fall through to getting one that matches another pattern.

This can be somewhat cumbersome at times. The Go model of allocating channels to send messages down is cleaner in some ways, although in Go the fact that you can have multiple threads holding references to the receiving end of a channel can cause other problems.

Bit Manipulation

One of the strengths of Erlang when it comes to network operations is the binary type. As with other primitive types in Erlang (atoms, numbers, lists and tuples), you can pattern match on binaries. This makes packet processing very fast. For example, if you have a TCP packet as a binary and you want to discard everything that wasn't sent to a specific destination port, you could filter like this:

DestPort = expected_destination(),
case next_packet() of
        <> ->
                handle_packet(SrcPort, Rest)
        ;
        _ -> false
end

If you've used Prolog, then you'll recognize that assignment in Erlang is a bit closer to Prolog's notion of unification than traditional assignment. Both SrcPort and DestPort are parts of the pattern that is matched, but if only DestPort is defined before entry into the case statement. This pattern-matching operation will assign a value to SrcPort from the binary if the binary has, as its second 16 bits (in big-endian format), the same value as DestPort.

This makes writing packet-processing code incredibly quick and easy. If none of the fields is defined, you can use this kind of pattern matching to simply decompose all the fields in a packet header. You can then independently pattern match on them, or use a dictionary mapping from port numbers to Erlang process IDs to forward them on for some additional processing.

Clustering

Scaling to single large servers is nice, and the hugely expensive 64-processor SGI box that gave me such a clear example of Erlang's strengths is vastly slower than a lot of cheaper machines we have in racks now, but sometimes a single machine just isn't fast enough. And then the only solution is to add more machines.

The asynchronous communication model in Erlang does a very good job of hiding latency within the system. If the time taken to deliver a message goes from nanoseconds on the local machine to milliseconds on another machine on the local network segment, a lot of code simply won't notice.

This is very different from code that uses shared memory, where the overhead of doing some kind of distributed shared memory involves a lot of network round trips to run something like a MESI protocol. It does still require some careful thought about how you will group the processes. Ideally, you want to keep the number of network round trips to a minimum, so you'll want to spawn processes that are used together on a single machine.

Erlang doesn't try to automatically place processes in a clustered system. It also doesn't migrate processes. It would sometimes be nice if it did, but the system was originally designed for telephone switches where jitter is even worse than latency. It is quite easy to write some generic code that will create processes on nodes in a round-robin fashion, add a little bit of load balancing, and even add some automatic forwarding for when you want migration. This is one of the first things that novice Erlang programmers typically implement. Shortly after that, they learn that the performance of such a system is typically much worse than stepping back and thinking about how the program is structured and distributing it accordingly.

Live Updates

As I said, Erlang was originally designed for telephone switches. These systems don't have downtime. Stopping the telephone network - or even part of it - for a software update is unthinkable. Erlang was therefore designed for live updates. The idea was to be able to deploy the new code, in parallel with the old code. New calls might be handled by the new code; old calls by the old code. Over a few hours, the system would gradually transition to only running the new code.

Existing processes can define their own transition time. When you load a new version of a module in a running Erlang system, there are two versions of it. When you refer to an unqualified name, you get the version that matches the version of the code that you're running. When you refer to a qualified name, you get the new version. I said earlier that Erlang processes are normally tail-recursive functions. The typical way for them to upgrade is to tail-call the new version of this function in response to a specific message. The priority system for messages allows them to do this when they have no other messages waiting or when they have no high-priority ones.

It still requires a little bit of careful design to allow this kind of upgrading, but it's quite possible to have a system that can upgrade in under a second with no downtime. It's also possible to use this system to allow for live testing. You can migrate a few connections over to the new code and check that it works before enabling all of them.

This kind of uptime isn't necessary for everyone. Most systems can handle a few seconds of service interruption as you kill the old server process and start a new one (although this is somewhat harder to do on a datacenter scale).

It's a shame that users have been trained to expect Internet services to just drop out periodically, and to know that they have to hit reconnect when they do. In a world with more Erlang deployed, this would be a lot rarer.

  • + Share This
  • 🔖 Save To Your Account