An Interview with Martin Fowler and Rebecca Parsons on Domain-Specific Languages
Also see a sample chapter from the book, An Introductory Example.
Neal: I happen to know that you've been working on this book for a long time. Why so long from concept to publication?
Martin: I often wonder about that myself. Partly it may be due to the topic. There’s a lot to cover so we can provide enough breadth to show people the range of alternatives and depth to get people started. There was also a long period where I was just looking at existing DSLs and trying out ideas. My speaking and traveling schedule has been busy too, and I may not have as much energy as I did when I was younger. The sad news is that it seems to be getting harder rather than easier to write books.
Rebecca: Making things "simple" takes a lot longer than simply understanding them, and the purpose was to take this mass of programming language processing tools from something only compiler writers could understand to something that regular programmers concerned with things other than language implementations could understand.
Martin: The enjoyable part of writing this book was looking at lots of examples out there and figuring out how to put that material into a cohesive structure. None of the techniques in the book are new, but it still takes time to organize them in a way to make it easier to understand them.
Neal: Is it really useful to have developers write their own languages? Why isn't one of the multitude of general purpose languages enough?
Rebecca: One of the major problems in software development is the ineffective communication of requirements from users to developers. While I believe that true ubiquitous end user programming is a long way off, I do think we can improve the level of readability of at least parts of programs. Sufficient increases in the readability of aspects of programs make communication with the end user about what the program does much more effective. General purpose programming languages are designed for developers, not for marketing people, or biologists, or investment bankers. The problem domains that programs address all have their own terminology. We can improve communication by narrowing the gap between the language of the problem and the language of the solution.
Neal: Isn't this another one of those academic subjects like semantic web and artificial intelligence that comes up now and again but never really finds a good practical use?
Martin: One of the things that makes it hard to gauge how useful DSLs are is the fact that until now there’s been no decent source that people can use to understand the techniques you need to build them. It reminds me very much of the position of Refactoring before I wrote that book—some people knew what to do, but there wasn’t anywhere you could point to as a coherent source of information. My hope is that this book will provide this source for DSLs, people will find it easier to go off and try the techniques, and we’ll learn better how useful they really are.
Rebecca: It is true that we've heard—literally for decades—about domain specific languages. I believe two issues have precipitated much of the failure: the goal of user writable and the notion that the whole program needed to be written in the DSL. Business people know the business domain. Developers know software development. Trying to turn all business people into developers is not going to happen anytime soon; there is, after all, a reason most people study for at least a few years before really beginning to be productive as a developer. Right now, we simply don't yet have the tools and techniques to make software development "simple" enough for people to just "pick up" like they do a word processor, in addition to having to know their problem domain.
Martin: Indeed, I don’t think that non-programmers will ever be able to just pick up general purpose programming. I think to program well requires a different mind-set and practised skills. The trick is to find ways that non-programmers can most effectively collaborate with programmers.
Rebecca: The second issue led to language creep. Many languages started nice and compact and focused on the domain. Then, since the whole (or at least most of the) program needed to be written in this language, we needed conditionals, and then we needed iteration, and then we needed abstraction, and soon you had a general purpose, Turing-complete programming language with a few domain concepts embedded in it. We then have programs that are just slightly less incomprehensible to people who know the problem domain.
Martin: This is the key point of what makes DSLs special—the fact that they are of limited expressiveness. So, to use them you combine several DSLs with a General-Purpose Language.
Neal: Do you have to understand things like language grammars and parsing to understand this subject?
Rebecca: Being a programming languages person, my answer might be a bit biased. However, I feel you need to understand language grammars, perhaps not in their full theoretical glory. A language grammar helps communicate what is legal in the language. It also provides the structure to allow the interpretation of the meaning of a fragment in the language. Whether you specify the grammar in full BNF or it is simply implied, the grammar has to exist. Grammar specification for programming languages provides a straightforward way to discuss what is a legal program. Understanding of parsing—at least understanding how parsing happens or why parsing works—is not necessary. The book provides examples of some simple parser implementation techniques that end up being quite useful for DSLs, which often are relatively simple from a formal languages perspective. Using internal DSLs also takes away the need to understand parsing, although you'll still need to understand how the implementation language works well enough to get your internal DSL implementation to work as you intend.
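As an illustration of how small such a grammar and parser can be, here is a minimal sketch in Python. It is not taken from the book: the transition language, the `parse_transitions` function, and the state and event names are all hypothetical, chosen only to show a grammar stated in informal BNF next to a simple line-by-line parser that enforces it.

```python
# Informal grammar for a hypothetical transition language (an assumption
# made for this sketch, not a language defined in the book):
#   script     ::= { transition NEWLINE }
#   transition ::= IDENT "on" IDENT "goto" IDENT
#   IDENT      ::= any run of non-whitespace characters


def parse_transitions(text):
    """Parse the script into a {(state, event): next_state} dictionary."""
    transitions = {}
    for number, line in enumerate(text.splitlines(), start=1):
        tokens = line.split()
        if not tokens:
            continue  # blank lines carry no meaning
        # The grammar allows exactly one shape of line; reject anything else.
        if len(tokens) != 5 or tokens[1] != "on" or tokens[3] != "goto":
            raise ValueError(
                f"line {number}: expected '<state> on <event> goto <state>'"
            )
        source, _, event, _, target = tokens
        transitions[(source, event)] = target
    return transitions


script = """
idle on door_closed goto active
active on door_opened goto idle
"""
table = parse_transitions(script)
```

The grammar says what a legal script is; the parser only has to check each line against that one rule and record the result, which is why a language this simple needs no parser generator.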
Neal: Won't encouraging developers to write their own languages lead to cacophony?
Martin: This is a very common objection, but to think about this point clearly you have to remember the relationship between DSLs and frameworks. In general DSLs are nothing more than a thin facade over an underlying framework or library. Whenever developers find repetition in their code, they should abstract it into some common code. They then manipulate that abstraction using a normal API, which we term a command-query API in the book. Using a DSL is a decision that a command-query API isn’t the best way of working with that abstraction, so you layer a DSL on top to make it easier to use. With or without a DSL, you still have to understand the underlying abstraction.
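The layering Martin describes can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not code from the book: `StateMachine`, `MachineBuilder`, and the state and event names are hypothetical. The first class is the underlying abstraction with a command-query API; the second is a thin fluent layer that populates the same abstraction.

```python
class StateMachine:
    """Hypothetical underlying abstraction with a plain command-query API."""

    def __init__(self, start):
        self.start = start
        self.transitions = {}  # (state, event) -> next state

    def add_transition(self, source, event, target):
        """Command: record that `event` moves `source` to `target`."""
        self.transitions[(source, event)] = target

    def next_state(self, state, event):
        """Query: the state reached, or the same state if nothing matches."""
        return self.transitions.get((state, event), state)


class MachineBuilder:
    """Thin fluent layer (an internal DSL) over the same StateMachine."""

    def __init__(self, start):
        self._machine = StateMachine(start)
        self._source = None
        self._event = None

    def state(self, name):
        self._source = name
        return self

    def on(self, event):
        self._event = event
        return self

    def goto(self, target):
        # Every DSL clause ends up as an ordinary command-query call.
        self._machine.add_transition(self._source, self._event, target)
        return self

    def build(self):
        return self._machine


# Command-query style
plain = StateMachine("idle")
plain.add_transition("idle", "door_closed", "active")
plain.add_transition("active", "door_opened", "idle")

# DSL style, producing an equivalent machine
fluent = (
    MachineBuilder("idle")
    .state("idle").on("door_closed").goto("active")
    .state("active").on("door_opened").goto("idle")
    .build()
)
```

Both routes build the same transition table. The fluent layer adds no new concepts; it only changes how the existing abstraction reads.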
Rebecca: Clearly any tool can be misused. However, used properly, you'll have a close correspondence between the issues being faced by the system—both business and technical—and the DSLs that get created. Since the cognitive distance between concepts in the domain and constructs in the DSL should be smaller than when using a general purpose language, overall understanding of the working of the system should improve. No more concepts are introduced using a DSL than already exist in the system. The DSL simply provides a more readable description of the system behavior.
Neal: Is this subject really only applicable to people who use Ruby, Scala, and Groovy?
Rebecca: Not at all. Internal DSLs are easier to do in languages like Ruby and Groovy, but DSLs can co-exist quite happily with other languages. In those cases, you'll be more likely to be using an External DSL.
Martin: That’s partly why most of the examples in the book are in Java and C#. The point is that the techniques are general ones that can be used with any language. A common theme in my writing is to find general techniques that you can learn once and apply to many languages—and this topic is no exception.
Neal: What would you view as the best outcome from having published your book in terms of industry impact or behavior changes?
Rebecca: Many people who only learned about parsing in a compiler course think language processing is hard. The tools are much simpler than they are often presented, as long as the language being processed is simple enough. Implementing a full general purpose programming language is far harder than implementing something like the state machine in the book, and yet state machines are quite powerful tools. My goal is for people to no longer view these tools as something only those strange language freak types like me can use.
Martin: I want people to know the full range of techniques that you can use with a DSL, so they can make better choices should they consider going down this path. I think this will lead to more DSLs being used, and an improvement in how DSLs are implemented. The interesting question is how broad an impact this will have. At this point I think lack of knowledge of these techniques obscures the broader questions of their impact, so by removing this I hope we can see better what can be done with them.
Neal: Can you reconcile the concept of polyglot programming (using several general purpose languages within the same project but hosted on the same virtual machine) with DSLs? Are these concepts orthogonal or complementary?
Martin: They are different concepts in that DSLs are not general-purpose languages.
Rebecca: Mostly orthogonal I would say, but there is a relationship. Polyglot programming's premise is that one selects the right language for each of the parts of the system and then weaves these parts together using the underlying virtual machine. Using a DSL is about designing the right language for communicating a particular concept and then allowing that implementation to work together with the rest of the system written in one or more general purpose or domain specific languages. The similarity is in choosing the right language for each part of the system, rather than choosing a language that is acceptable for the whole system.
Martin: So they are different concepts, but built on the same premise. There is no One Best Language. Languages are tools so we need to use the right tool for the job. This principle leads to multiple general-purpose languages (polyglot programming) and to using DSLs.
Neal: Are average developers good enough to write DSLs?
Martin: Like so many questions you can replace ‘DSL’ with ‘library’. Are average developers good enough to write libraries?
Rebecca: I think there are two issues to separate here—language design and language implementation. I think an average developer can implement a DSL as described in the book with no more difficulty than the rest of the problem. I think the choice of implementation will vary based on the level of expertise available to understand the implementation. For example, some Ruby internal DSL implementations can be difficult to understand. Using an External DSL in that case might aid in understanding.
Martin: I think the hardest part is coming up with the abstraction, which usually manifests itself in the form of a library. That’s a fundamental part of programming. Any programmer has to learn how to do this, and only half of programmers do it better than average.
Rebecca: Designing a good DSL is a different issue. We don't really address the issue of good language design in the book—we couldn't cover everything. However, I think there's still a lot of research and study needed to come up with guidance for how to design a good DSL. I suspect the guidelines might also vary depending on the target user.
Martin: If the book is successful we’ll see many more cruddy DSLs—Sturgeon's Law always applies—but we’ll also see some more good DSLs. The good DSLs will provide the value that will make it all worthwhile.
Neal: What would you have a developer tell a manager to convince them to allow some of these techniques?
Rebecca: Two major benefits come to mind that apply in different situations. The first involves decreasing overall implementation time (when one includes user testing) by increasing the effectiveness of the communication of requirements. If the problem has been properly communicated, the probability that the system does "the right thing" increases significantly, and the problem of having to rewrite software simply because the requirement was misunderstood decreases dramatically.
Martin: This is the brass ring of DSLs—the ability to open up a deep collaboration between programmers and domain experts. Often this is expressed as saying domain experts will write programs in DSLs. My sense is that won’t be the case most of the time and that the more realistic point is where domain experts can read the DSLs. Readability is enough to unlock the key communication benefits.
Rebecca: The second benefit involves longer term maintenance. A system that is easier to understand is easier to maintain and evolve. Complex business logic buried in code is hard to understand. Putting that complex business logic in a language suited to it makes the job of understanding what the system really does simpler.
Martin: Consider regular expressions—they may be cryptic and often hard to follow, but they are much easier to understand than what the alternative code would look like.
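A small comparison makes the point. The sketch below is an illustration in Python, not an example from the book; the date-shaped pattern and both function names are hypothetical. It checks the same rule twice: once with a regular expression and once with hand-written code.

```python
import re

# Four digits, a hyphen, two digits, a hyphen, two digits.
DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def has_date_shape_regex(text):
    """Check the shape using the regular expression DSL."""
    return DATE_SHAPE.fullmatch(text) is not None


def has_date_shape_manual(text):
    """Check the same shape with general-purpose code only."""
    if len(text) != 10:
        return False
    for index, char in enumerate(text):
        if index in (4, 7):
            if char != "-":
                return False
        elif char not in "0123456789":
            return False
    return True
```

The regular expression is terse, but the whole rule fits on one line; the hand-written version spreads the same rule across a loop, an index test, and three early returns.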