Interview with Andrei Alexandrescu (Part 3 of 3)

Eric Niebler and Andrei Alexandrescu conclude their conversation about the D programming language by discussing concurrency, the complications of sharing data, dynamic loading, specification and licensing, and the future of D.

See Part 1 and Part 2 of this interview.

Eric: Every new language is driven by the problems of its time. Today, the big bugaboo is concurrency, and D brings a lot of new (and old) ideas to the table: message-passing, the shared type annotation, immutable data, synchronized classes, locks, and atomics. In your opinion, which of D's concurrency features will make the biggest difference to in-the-trenches programmers?

Andrei: The most important concurrency feature is definitely memory isolation between threads. In most of today's imperative languages, all threads hang out in the same bouillabaisse. The coder, the reviewer, and the compiler have no guarantee that some piece of data isn't shared. This is ironic because a good system is very frugal about data sharing across threads, so the all-shared assumption hurts everyone everywhere. In contrast, in D you can assume that data is not shared, except for small islands of data marked with the shared qualifier.

This setup changes a lot of things. On the face of it, default isolation is harsher than default sharing: Your threads are mostly like operating system processes, just without the inherent overhead. So threads are forced to communicate through highly structured mechanisms—most importantly message passing (MP). MP fosters coherent protocols, scales beautifully from threads to processes to networked machines, and is mostly immune to low-level sources of nondeterminism such as data races. (I mean, how often do you have concurrency troubles when assembling pipes of UNIX commands?)
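As a rough illustration of the message-passing style described here, the following minimal sketch uses `spawn`, `send`, and `receiveOnly` from `std.concurrency`. The function names `summer` and `sumViaMessages` are hypothetical and chosen only for this example; they do not come from the interview or the book.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.stdio : writeln;

// Hypothetical worker: receives n ints from its owner, then replies
// with their sum. It shares no data with the thread that spawned it.
void summer(int n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += receiveOnly!int();   // blocks until an int message arrives
    send(ownerTid, total);            // reply to the spawning thread
}

// Spawns the worker, sends it each value, and waits for the answer.
int sumViaMessages(int[] values)
{
    auto worker = spawn(&summer, cast(int) values.length);
    foreach (v; values)
        send(worker, v);              // ints are copied, never shared
    return receiveOnly!int();
}

void main()
{
    writeln(sumViaMessages([1, 2, 3, 4]));   // prints 10
}
```

The two threads never touch the same variable; all coordination happens through the mailboxes, which is the property that lets the same protocol scale from threads to processes.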

If you want to share data across threads, you can do that by using the shared qualifier, but be warned—by intent it's limited to uses that are provable by the compiler to be free of data races. For now, the type system is rather conservative, but we hope that sharing is rare enough to be covered with a limited palette of patterns.

So D is turning sharing on its head—by default, threads share nothing and communicate via messages. Switching from share-by-default languages involves a learning curve, but I believe that the benefits are significant and lasting.

For those interested in the full story, Addison Wesley Longman and InformIT have kindly provided the full chapter on concurrency from my book The D Programming Language.

Eric: I'm particularly interested in the shared type annotation. Data is not shared across threads by default (interesting!) and, if I understand correctly, only atomic operations are allowed on shared data. To enforce higher-level invariants, locks must still be used and the shared annotation cast away. What, then, is the use of shared?

Andrei: D's shared is an improved version of Java's volatile and C++'s atomic. All of these tell the compiler that the data in question is being accessed by several threads at once. Consequently, the compiler tiptoes whenever manipulating such data. The details depend on the language, but in essence the compiler forces predictable access to such data—load and store instructions are not moved around or optimized away. Also, the compiler inserts hardware handshakes (memory barriers) that ensure hardware-level visibility and sequencing constraints.

D's shared differs in two ways from other languages' similar offerings. First, its absence is as important as its presence: Data that isn't tagged with the shared qualifier is guaranteed not to be shared. In Java or C++, all data could be shared, and the data is handled properly only if you remember to tack on volatile or atomic.

This is a good improvement in the state of affairs. Many errors in threaded programs are caused by undue sharing or by paradoxical interactions between shared and unshared data. A correctly typed D program has no undue sharing: If anything is shared, it bears the shared qualifier, so the programmer and the compiler both know about it.
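A small sketch can make the first point concrete. It assumes `core.thread.Thread` and the `core.atomic` helpers from D's runtime library; the names `localCounter`, `sharedCounter`, `otherThread`, and `observe` are hypothetical and exist only for this illustration.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;

int localCounter;          // no qualifier: every thread gets its own copy
shared int sharedCounter;  // explicitly shared: one copy for all threads

void otherThread()
{
    // A new thread starts with a freshly initialized localCounter.
    assert(localCounter == 0);
    localCounter = 100;                 // invisible to the main thread
    atomicOp!"+="(sharedCounter, 1);    // visible to every thread
}

// Returns what the calling thread observes after the helper thread ran:
// its own localCounter and the shared counter.
int[2] observe()
{
    atomicStore(sharedCounter, 0);
    localCounter = 5;
    atomicOp!"+="(sharedCounter, 1);

    auto t = new Thread(&otherThread);
    t.start();
    t.join();

    return [localCounter, atomicLoad(sharedCounter)];
}

void main()
{
    // The helper's write to localCounter did not leak; its atomic
    // increment of sharedCounter did.
    assert(observe() == [5, 2]);
}
```

Only the variable marked `shared` is affected by the second thread, so a reader of the declarations alone can tell which data needs concurrency scrutiny.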

Second, D's shared is transitive. Consider for example a shared pointer to an integer, a type that in D you spell shared(int*). Now, given that several threads could access the pointer, is it reasonable to assume the actual int pointed to by that pointer is not shared? No, sir. All a thread has to do to get to the int is dereference the pointer. So we have this interesting fact—if you can reach some data starting from shared data, then the reached data must also be shared.

The shared property is transitive over the points-to relation! And that happens for all indirect accesses—pointers, array elements, class references. Once something is shared, bang! Midas' touch ensues—the entire transitively reachable subgraph of your program's object graph is painted shared. If shared weren't transitive, many of its uses wouldn't have been provably correct.
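The transitivity rule can be checked at compile time. The sketch below is illustrative only: `Node`, `head`, and `sharedIsTransitive` are hypothetical names, and the `static assert` lines simply restate the rule as type identities the compiler must accept.

```d
// Hypothetical type used only for illustration.
struct Node
{
    int value;
    Node* next;
}

shared Node head;

// The example from the text: a shared pointer to int. The qualifier
// reaches through the pointer, so shared(int*) is the same type as
// shared(shared(int)*).
enum sharedIsTransitive = is(shared(int*) == shared(shared(int)*));
static assert(sharedIsTransitive);

// It is a different type from an unshared pointer to a shared int.
static assert(!is(shared(int*) == shared(int)*));

// Fields reached through a shared object are typed shared as well,
// including the pointer field and, through it, the next Node.
static assert(is(typeof(head.value) == shared(int)));
static assert(is(typeof(head.next) == shared(Node*)));

void main()
{
    assert(sharedIsTransitive);
}
```

If any of these identities did not hold, the module would fail to compile, which is the sense in which the property is enforced by the type system and not by convention.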

That design avoids low-level data races, and it's a good programming model for lock-free programming (which relies on shared and the compare-and-swap primitive, which D defines as a compiler intrinsic). But of course programs need higher-level invariants, and here's where you could use old-school synchronized methods. A synchronized method peels the outer layer of shared off the direct members of an object (but not the indirectly accessed memory—that would break shared's guarantees).
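For a sense of what lock-free code built on shared and compare-and-swap can look like, here is a minimal sketch. It assumes the `cas` function exposed by `core.atomic` in D's runtime library as the compare-and-swap primitive; `hits`, `increment`, `worker`, and `runWorkers` are hypothetical names invented for this example.

```d
import core.atomic : atomicLoad, atomicStore, cas;
import core.thread : Thread;

shared int hits;

enum incrementsPerThread = 1_000;

// Lock-free increment: read the current value, then try to swap in
// value + 1. If another thread changed hits in between, cas fails
// and the loop retries with the fresh value.
void increment()
{
    int old;
    do
    {
        old = atomicLoad(hits);
    } while (!cas(&hits, old, old + 1));
}

void worker()
{
    foreach (i; 0 .. incrementsPerThread)
        increment();
}

// Runs threadCount workers to completion and returns the final count.
int runWorkers(int threadCount)
{
    atomicStore(hits, 0);

    Thread[] threads;
    foreach (i; 0 .. threadCount)
        threads ~= new Thread(&worker);
    foreach (t; threads)
        t.start();
    foreach (t; threads)
        t.join();

    return atomicLoad(hits);
}

void main()
{
    assert(runWorkers(4) == 4 * incrementsPerThread);
}
```

No increment is lost even though no lock is taken, because every update goes through the retry loop around `cas`.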

Sounds cool, eh? Well, not so fast. Operations on shared data are limited, and the rules are pretty harsh. To many D programmers coming from other languages, the system feels constraining at first. But I'm not worried at all:

  • It's not as if lock-based programming has an amazing track record of correctness and scalability. It's very comforting to know that getting things to compile really is getting things to work.
  • D fosters message-passing first, lock-based/lock-free programming second, and unchecked, cast-driven hacking a distant third.

Of course, many details are involved, so I'm glad that Addison Wesley Longman and InformIT have made the chapter on concurrency available for free.

Eric: D has great support for modules. Can D modules be loaded dynamically like DLLs or SOs?

Andrei: Well, you just walked into the messy attic during the open house visit. Come back downstairs, have some wine. Look, hardwood floors!

We're not up to snuff with dynamic loading; until recently it's been difficult to keep the "to do" plate in balance. Only in the last month or so has dynamic loading finally started to get more attention. Currently D offers decent function-level support for Windows DLLs: linking with stub libraries and full dynamic loading. UNIX shared objects need more work before we get there. But, at the moment, D doesn't do dynamic reflection such as loading of a class and inspecting its members. I'm confident that we'll get it done, however. D has quite strong generative capabilities, so the standard library can take care of the boilerplate code for the dynamic libraries and their clients without changing the language and without aggravating the programmer. It will take some time, though. At the moment, Walter Bright's first priority is to finalize the 64-bit native compiler, after which he plans to focus on dynamic loading.

Eric: Is D still evolving, and is there a specification? What would you say to a shop that's considering doing some development in D, but worries about language instability and lack of support?

Andrei: For now, The D Programming Language is the closest thing we have to a specification, and we're really careful to keep the language in sync with the book. Yet of course D is evolving, and hopefully will continue to evolve just like any programming language that people use. However, Walter and I had decided a long time ago that the D book will mark the end of a Cambrian period in the development of the language, which started as D1 and then off-sprung the D2 branch. From here on, we're talking about backwardly compatible changes.

The schism between D2 and D1 was a painful but ultimately winning decision. D1 is a nice language, but it's too conventional—it doesn't kick enough butt to make it big. (I hope there's no pun in there.) As a fundamental example, I don't see how D1 could have defined shared in a backwardly compatible manner that's also useful. That pretty much fizzles any interesting concurrency effort for D1. D2 is a different hat; it innovates so strongly and in such vital areas—concurrency, safety, generic programming—that it can stand on merit alone to challenge established languages.

So now the book is out, and D's evolution has taken a turn. First, I'm seeing the community growing—people are getting enthusiastic and want to help. Second, we focused our attention on stability and streamlining—cleaning that attic, you know. There's no incompatible D3 in the foreseeable future; D2 is here to stay.

What would I say to a shop considering adopting D? Stability is coming, support is there. Yet clearly there's a risk and an upfront investment. For example, early adopters would bear the brunt of defining necessary libraries such as database interfaces, web server plug-ins, Thrift bindings, and whatnot. But there's also a huge opportunity—we're talking about a car, not a better horse.

To give you just an example, at my workplace we use C++ for some of our core systems. Although we modularize carefully, building anything is horrendously slow. I touched one C++ source (not header!) file just now and rebuilt—78 seconds. That's more than enough to disrupt your groove. I've read a study somewhere—people's very mental processes slow down when they deal with slow interactive systems. Hey, I don't want to get even dumber over here!

Then there's the Go language, which compiles much faster than C++. (They even have a brief video about that fact on Go's website.) And then comes D, which compiles 4.5 times faster than Go. So we don't need a video—a still picture would suffice. Overall we're talking about more than two orders of magnitude improvement over C++ compilation times. That's not just better; it's a complete game changer. It literally makes you start thinking differently.

Eric: What is your sense of D uptake in the industry? Is there demand for D programmers?

Andrei: There's no surge of demand for D yet, but I hadn't expected any at this point. We built it; they will come. The book is a key term in the equation, and it just arrived. Allow me to predict that a few corporations will start using D by the end of 2010, on the heels of enthusiasts who use it for fun. Again, D offers formidable strategic advantages that surpass the initial tactical inconveniences—not to mention that it makes hacking pretty darn fun. I truly think that D2 has bloomed into an extremely compelling language, and almost everyone who reads the book gets excited—you, too, I reckon, unless you were just being nice.

Walter Bright and many others (including me) have invested a lot over the past years in making D a great language; and, really, as far as D2 is concerned, it's been "A Series of Fortunate Events"—a streak of correct decisions on difficult issues regarding concurrency, safety, corner cases in generic programming, and more. I really don't know of another language that manages to do the right thing so often without compromising in any important dimension.

I should add that some of my coworkers at Facebook got interested in D (I bribed them with free copies of my book) and got the green light to use D during an internal "hackathon" and for another small project. For Facebook's engineers and many others, the availability of a D binding for the Thrift framework is important (for example, all Facebook software services use Thrift). I'm cautiously optimistic that some enterprising Facebook-er will soon find the time to implement such a binding and enable D for large scale service-based architectures. Time will tell; it'll be interesting to see what will happen in the next few months.

Eric: What license does the Digital Mars D compiler use? Is an open-source implementation of D available?

Andrei: This is the proverbial moment when I open my mouth and remove all doubt that I'm a fool. I'm not versed in licensing matters; I like sharing my code, but I routinely forget to insert licensing comments. Many worried people asked me about the licensing of ScopeGuard (because it had no license comment at all), and Loki's license (MIT) is the result of a five-minute search for the most permissive license I could find at the time.

Walter knows much more than I do about licensing, but we both share a tendency toward openness. He has licensed the compiler's front end under both GPL and Artistic licenses. The front-end source is distributed along with the compiler distribution. The back end is not open-sourced for the simple reason that Walter has licensed it from his previous employer.

The standard library, which contains a lot of my code, uses the Boost license, which according to many people ranks as the most hassle-free, permissive, and corporate-vetted license. I'm sure you know more about that than I do, because you wrote a lot of Boost-licensed code!

Two other compiler projects are currently active: LDC uses the LLVM back end and the BSD license, and a recently resurrected GCC front end called GDC obviously uses GNU GPL.

Eric: Ah, but what state are they in? Do they implement D1 or D2? Is the LLVM effort using Walter's front end? (That sounds like a winner to me!)

Andrei: All projects I mentioned use the open-sourced reference front end to implement both D1 and D2, and trail behind the reference compiler by a few minor releases.

Eric: How are you liking it at Facebook?

Andrei: I love it! I work with great hackers on difficult problems and enjoy every minute of it. Facebook has managed to amass some amazing talent; I've never seen a recruiting department as innovative and active as Facebook's. They're really good at finding diamonds in the rough. Also, the interview process is taken very seriously and continuously streamlined. It's hard to get into Facebook—I'm talking Foreign Legion hard.

So on one hand you have these problems that almost nobody has had before. Have you heard? Facebook just reached 500 million active users! And, come to think of it, the site has virtually no static content. It's not heavy pounding against an HTML homepage—it's a half-billion distinct home pages, each dynamically generated! Then, on the other hand, you have this group of very talented, dedicated, and loosely organized people who keep on thinking of ways to make the site better. The net result is sheer awesomeness. Get this—I've seen beautiful PHP code. (Beats the hell out of seeing dead people, doesn't it?)

One great thing about working here is the breadth of exposure, the combination between theory and practice. You'd think there's little more to Facebook than web scripting, but there's some really heavy stuff going on—machine learning, natural language processing, graph theory, statistics, auction theory—it's all part of a normal day at work around here. I swear there's some neural network somewhere. One day that site will become sentient.

Anyway, I could go on, but let me not overstay my welcome. Speaking of which, Eric, I'd like to thank you and our kind editor, Jennifer Bortel, for having me and for letting me run my mouth for so long. My only excuse to InformIT's readers is this: After having worked with Walter and others on D in relative obscurity, I'd accumulated quite a lot of pent-up excitement.

Eric: Finally, let me say that I thoroughly enjoyed The D Programming Language. Best wishes to you and Sanda and Andrew, and thanks for bringing your ideas to D!