What Language I Use for… Building Scalable Servers: Erlang

David Chisnall tells you why his language of choice for building scalable servers is Erlang.

I first felt the real power of Erlang during my PhD, when I wrote a server for remote paging with aggressive prefetching on my (single-core) laptop and then deployed it on a 64-processor SGI machine and saw it scale smoothly up to the available processors.

It's quite easy to write code in C that has a handful of threads. It's even easy to have a few dozen threads if they're largely independent. It's incredibly hard to write C code that has a few thousand threads, all communicating closely, and without insane overheads from contention when you try scaling it out beyond about eight cores.

Erlang is the only language in which I have not only managed to write code that scaled like this, but also had that code work the first time.

Erlang Processes

The basic unit of Erlang code is a process. Processes are very lightweight: creating one in Erlang has about the same overhead as allocating an object in another language. You can spawn very short-lived processes for asynchronous events and then let them exit on completion, or you can manage pools of processes that distribute work and spawn more as required.
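As a minimal sketch of the short-lived-process style, the module below spawns one process per work item and collects the replies. The module name spawn_demo and the squaring workload are illustrative assumptions, not code from the original server.

```erlang
-module(spawn_demo).
-export([run/1]).

%% Spawn N short-lived processes. Each one squares its argument,
%% sends the result back to the parent, and then exits.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), I * I} end)
            || I <- lists:seq(1, N)],
    %% Collect exactly one reply per worker, in spawn order.
    %% Pid is already bound here, so each receive is selective.
    [receive {Pid, Result} -> Result end || Pid <- Pids].
```

Calling spawn_demo:run(10000) creates ten thousand processes, which is an unremarkable number for the Erlang VM.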

Erlang processes are not the same as processes in the host operating system. They are scheduled by the Erlang VM, in the same way that Java threads are not necessarily threads in the host operating system. They're called processes, not threads, because Erlang has a shared-nothing model.

As I've said before, if you want to write scalable, maintainable, parallel code, there is one rule that you must abide by: No data may be both shared and mutable. Erlang enforces this because within a process it has an (almost) purely functional model. All variables are immutable, with just one exception: the process dictionary. This is a simple map that is associated with the process and can be modified, but not shared (keys and values in it can be shared, but they are immutable).
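The following sketch illustrates that the process dictionary is mutable but private to each process. It uses the built-in put/2 and get/1; the module name pdict_demo and the key counter are made up for the example.

```erlang
-module(pdict_demo).
-export([run/0]).

run() ->
    %% Mutate the parent's own dictionary.
    put(counter, 1),
    Parent = self(),
    spawn(fun() ->
        %% The child starts with its own, empty dictionary,
        %% so this lookup returns undefined.
        ChildValue = get(counter),
        %% This write is only visible inside the child.
        put(counter, 99),
        Parent ! {child_saw, ChildValue}
    end),
    receive
        {child_saw, Seen} -> {Seen, get(counter)}
    end.
```

Here run/0 returns {undefined, 1}: the child never sees the parent's value, and the child's write does not leak back.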



Message Passing

An Erlang process is typically implemented as a tail-recursive function that waits for a message, processes it, and then calls itself. Message sending in Erlang is asynchronous and buffered. This means that a process can send several messages before the scheduler switches to the receiver, which avoids some of the (already small) cost of context switching between processes.
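A sketch of that tail-recursive shape, using a hypothetical counter process (the module name, message formats, and helper functions are assumptions for illustration):

```erlang
-module(counter).
-export([start/0, increment/1, value/1, loop/1]).

%% Start a counter process whose state is the integer 0.
start() -> spawn(?MODULE, loop, [0]).

%% Asynchronous: returns as soon as the message is queued.
increment(Pid) -> Pid ! increment, ok.

%% Synchronous: send a request, then wait for the matching reply.
value(Pid) ->
    Pid ! {value, self()},
    receive {Pid, Count} -> Count end.

%% The process body: wait for a message, handle it, then tail-call
%% itself with the (possibly new) state.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From} ->
            From ! {self(), Count},
            loop(Count);
        stop ->
            ok
    end.
```

Because messages between a given pair of processes are delivered in order, several increment/1 calls followed by value/1 return the updated count.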

The receive statement in Erlang receives the first message matching a particular pattern. There are generally two ways of getting messages. The first, to serve requests in order, is to do a receive with a single catch-all pattern and then have a case statement, or to do a receive with a number of different patterns to match the received message and handle it. The second method, which allows you to have some notion of priority, is to try to get the first message matching a specific pattern; then, if there isn't one, fall through to getting one that matches another pattern.
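The second, priority-style approach can be sketched with a nested receive and a zero timeout. The {urgent, _} message shape and the module name priority_demo are assumptions chosen for the example.

```erlang
-module(priority_demo).
-export([next/0]).

%% Return the next message for the calling process, preferring
%% messages of the form {urgent, _} over everything else.
next() ->
    receive
        {urgent, _} = Urgent ->
            Urgent
    after 0 ->
        %% No urgent message is queued: fall through and block
        %% until a message of any kind arrives.
        receive
            Any -> Any
        end
    end.
```

The first receive scans the whole mailbox for an urgent message, so it returns one even when ordinary messages arrived earlier.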

This can be somewhat cumbersome at times. The Go model of allocating channels to send messages down is cleaner in some ways, although in Go the fact that you can have multiple threads holding references to the receiving end of a channel can cause other problems.

Bit Manipulation

One of the strengths of Erlang when it comes to network operations is the binary type. As with other primitive types in Erlang (atoms, numbers, lists and tuples), you can pattern match on binaries. This makes packet processing very fast. For example, if you have a TCP packet as a binary and you want to discard everything that wasn't sent to a specific destination port, you could filter like this:

DestPort = expected_destination(),
case next_packet() of
        % Matches only if the second 16-bit field equals DestPort
        <<SrcPort:16, DestPort:16, Rest/binary>> ->
                handle_packet(SrcPort, Rest);
        _ ->
                false
end

If you've used Prolog, then you'll recognize that assignment in Erlang is a bit closer to Prolog's notion of unification than to traditional assignment. Both SrcPort and DestPort are parts of the pattern that is matched, but only DestPort is bound before entry into the case statement. This pattern-matching operation will assign a value to SrcPort from the binary if the binary has, as its second 16 bits (in big-endian format), the same value as DestPort.

This makes writing packet-processing code incredibly quick and easy. If none of the fields is defined, you can use this kind of pattern matching to simply decompose all the fields in a packet header. You can then independently pattern match on them, or use a dictionary mapping from port numbers to Erlang process IDs to forward them on for some additional processing.
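When none of the fields is bound in advance, the same kind of pattern decomposes the whole fixed header. The sketch below assumes the standard 20-byte TCP header layout and leaves any options in Rest; the module name tcp_header and the map keys are illustrative.

```erlang
-module(tcp_header).
-export([parse/1]).

%% Split a TCP segment into its fixed 20-byte header fields and the
%% remaining bytes (options and payload). Integer fields are
%% big-endian and unsigned by default.
parse(<<SrcPort:16, DestPort:16, Seq:32, Ack:32,
        DataOffset:4, _Reserved:4, Flags:8,
        Window:16, Checksum:16, Urgent:16, Rest/binary>>) ->
    {ok, #{src_port => SrcPort, dest_port => DestPort,
           seq => Seq, ack => Ack,
           data_offset => DataOffset, flags => Flags,
           window => Window, checksum => Checksum,
           urgent => Urgent}, Rest};
parse(_) ->
    %% Fewer than 20 bytes: not a complete header.
    {error, too_short}.
```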

Clustering

Scaling to single large servers is nice, and the hugely expensive 64-processor SGI box that gave me such a clear example of Erlang's strengths is vastly slower than a lot of cheaper machines we have in racks now, but sometimes a single machine just isn't fast enough. And then the only solution is to add more machines.

The asynchronous communication model in Erlang does a very good job of hiding latency within the system. If the time taken to deliver a message goes from nanoseconds on the local machine to milliseconds on another machine on the local network segment, a lot of code simply won't notice.

This is very different from code that uses shared memory, where the overhead of doing some kind of distributed shared memory involves a lot of network round trips to run something like a MESI protocol. It does still require some careful thought about how you will group the processes. Ideally, you want to keep the number of network round trips to a minimum, so you'll want to spawn processes that are used together on a single machine.

Erlang doesn't try to automatically place processes in a clustered system. It also doesn't migrate processes. It would sometimes be nice if it did, but the system was originally designed for telephone switches where jitter is even worse than latency. It is quite easy to write some generic code that will create processes on nodes in a round-robin fashion, add a little bit of load balancing, and even add some automatic forwarding for when you want migration. This is one of the first things that novice Erlang programmers typically implement. Shortly after that, they learn that the performance of such a system is typically much worse than stepping back and thinking about how the program is structured and distributing it accordingly.
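For reference, the naive round-robin placement described above can be sketched in a few lines using the built-in spawn/2 (node plus fun). The module name round_robin and the function spawn_all/2 are hypothetical, and spawning a fun on a remote node assumes the module that defines the fun is also loaded there.

```erlang
-module(round_robin).
-export([spawn_all/2]).

%% Spawn one process per fun in Funs, cycling through Nodes in order.
%% Returns a list of {Node, Pid} pairs in the same order as Funs.
spawn_all(Nodes, Funs) when Nodes =/= [] ->
    place(Nodes, Nodes, Funs).

%% place(AllNodes, NodesLeftInThisCycle, FunsStillToPlace)
place(_All, _Remaining, []) ->
    [];
place(All, [], Funs) ->
    %% Reached the end of the node list: start the next cycle.
    place(All, All, Funs);
place(All, [Node | Rest], [Fun | Funs]) ->
    [{Node, spawn(Node, Fun)} | place(All, Rest, Funs)].
```

A typical call would be round_robin:spawn_all([node() | nodes()], Funs), which spreads work evenly but ignores which processes talk to each other.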

Live Updates

As I said, Erlang was originally designed for telephone switches. These systems don't have downtime. Stopping the telephone network - or even part of it - for a software update is unthinkable. Erlang was therefore designed for live updates. The idea was to be able to deploy the new code, in parallel with the old code. New calls might be handled by the new code; old calls by the old code. Over a few hours, the system would gradually transition to only running the new code.

Existing processes can define their own transition time. When you load a new version of a module in a running Erlang system, there are two versions of it. When you refer to an unqualified name, you get the version that matches the version of the code that you're running. When you refer to a qualified name, you get the new version. I said earlier that Erlang processes are normally tail-recursive functions. The typical way for them to upgrade is to tail-call the new version of this function in response to a specific message. The priority system for messages allows them to do this when they have no other messages waiting or when they have no high-priority ones.
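A minimal sketch of that upgrade pattern: the loop calls itself with an unqualified name for ordinary messages and with a module-qualified name when it receives an upgrade message. The module name upgradable and the message formats are assumptions, and for simplicity this version handles the upgrade message in arrival order rather than by priority.

```erlang
-module(upgradable).
-export([start/0, loop/1]).

start() -> spawn(?MODULE, loop, [0]).

loop(State) ->
    receive
        upgrade ->
            %% Qualified call: continues in the newest loaded
            %% version of this module, carrying the state across.
            ?MODULE:loop(State);
        {add, N} ->
            %% Unqualified call: stays in the version of the code
            %% that this process is already running.
            loop(State + N);
        {get, From} ->
            From ! {state, State},
            loop(State)
    end.
```

If the new version changes the shape of State, the upgrade clause is the natural place to convert it before making the qualified call.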

It still requires a little bit of careful design to allow this kind of upgrading, but it's quite possible to have a system that can upgrade in under a second with no downtime. It's also possible to use this system to allow for live testing. You can migrate a few connections over to the new code and check that it works before enabling all of them.

This kind of uptime isn't necessary for everyone. Most systems can handle a few seconds of service interruption as you kill the old server process and start a new one (although this is somewhat harder to do on a datacenter scale).

It's a shame that users have been trained to expect Internet services to just drop out periodically, and to know that they have to hit reconnect when they do. In a world with more Erlang deployed, this would be a lot rarer.
