
Textual DSLs

Before talking about graphical DSLs, let's look briefly at textual DSLs. We'll see how Domain-Specific Development involves a particular way of thinking about a problem, and we'll look at how to implement this approach using textual languages.

Imagine that we are designing a graphical modeling tool and have the problem of defining a set of shapes that will be displayed on a screen to represent the various concepts that can be depicted by the tool. One way we might do this would be to invent a new textual language for defining the various shapes. A fragment of this language might look like this:

Define AnnotationShape Rectangle
      Decorator Comment
      End Comment
End AnnotationShape

To process this language, a program must be written to parse and interpret this text. Written from scratch as a programming exercise, this is a big job. But a parser-generator might be used, which itself takes as input a description of the grammar of the new language, such as the following, based on BNF (Backus-Naur Form, originally developed for defining the Algol language):

Definitions ::= Definition*

Definition ::= Define Id Shape
      Width Eq Number
      Height Eq Number
      FillColor Eq Color
      OutlineColor Eq Color
      Decorator*
End Id

Shape ::= Rectangle | RoundedRectangle | Ellipse

Eq ::= "="

Decorator ::= Decorator Id
      Position Eq Position
End Id

Position ::= Center |
             TopLeft |
             TopRight |
             BottomLeft |
             BottomRight

The definitions for Id, Number, and Color are not included here; it's assumed that they are built into the grammar-defining language.

We need an algorithm to convert this BNF into a parser for the language it describes. We'd either use an existing parser-generator such as Yacc, Bison, ANTLR, or Happy, or an expert might write a parser by hand in a general-purpose third-generation programming language such as C# or Java.
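
To give a sense of what a hand-written parser involves, here is a minimal recursive-descent sketch in Python for the shape language. It is an illustration under stated assumptions, not the authors' implementation: the class and method names are invented for the example, tokens are simply "=" and whitespace-separated words, Color is treated as a bare word, any number of decorators may appear inside a definition (as in the fragment above), the list of positions follows the chapter's XML schema, and the name after End is required to match the name being closed. The attribute values in the sample text are placeholders.

```python
import re
from dataclasses import dataclass, field

SHAPES = {"Rectangle", "RoundedRectangle", "Ellipse"}
POSITIONS = {"Center", "TopLeft", "TopRight", "BottomLeft", "BottomRight"}

@dataclass
class Decorator:
    name: str
    position: str

@dataclass
class Definition:
    name: str
    shape: str
    width: float
    height: float
    fill_color: str
    outline_color: str
    decorators: list = field(default_factory=list)

class Parser:
    def __init__(self, text):
        # A token is "=" or a run of characters that are neither whitespace nor "=".
        self.tokens = re.findall(r"=|[^\s=]+", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        # Consume one token, optionally insisting on a specific keyword.
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            wanted = repr(expected) if expected is not None else "a token"
            raise SyntaxError(f"expected {wanted}, found {token!r}")
        self.pos += 1
        return token

    def one_of(self, allowed):
        token = self.take()
        if token not in allowed:
            raise SyntaxError(f"expected one of {sorted(allowed)}, found {token!r}")
        return token

    def attribute(self, keyword):
        # keyword Eq value
        self.take(keyword)
        self.take("=")
        return self.take()

    def definitions(self):
        # Definitions ::= Definition*
        result = []
        while self.peek() is not None:
            result.append(self.definition())
        return result

    def definition(self):
        self.take("Define")
        name = self.take()
        shape = self.one_of(SHAPES)
        width = float(self.attribute("Width"))
        height = float(self.attribute("Height"))
        fill = self.attribute("FillColor")
        outline = self.attribute("OutlineColor")
        decorators = []
        while self.peek() == "Decorator":  # one token of look-ahead is enough here
            decorators.append(self.decorator())
        self.take("End")
        self.take(name)  # assumption: the closing Id must repeat the opening Id
        return Definition(name, shape, width, height, fill, outline, decorators)

    def decorator(self):
        self.take("Decorator")
        name = self.take()
        self.take("Position")
        self.take("=")
        position = self.one_of(POSITIONS)
        self.take("End")
        self.take(name)
        return Decorator(name, position)

# Placeholder values, chosen only to exercise the parser.
SAMPLE = """
Define AnnotationShape Rectangle
      Width = 2.0
      Height = 1.0
      FillColor = White
      OutlineColor = Black
      Decorator Comment
            Position = Center
      End Comment
End AnnotationShape
"""
shapes = Parser(SAMPLE).definitions()
```

Each grammar rule becomes one method, which is why this style is manageable for a small language. It also shows where the effort goes as a language grows: error reporting, look-ahead decisions, and editor support all have to be written by hand.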

Notice that the BNF is itself a DSL. We might "bootstrap" the BNF language by describing its grammar in itself, causing it to generate a parser for itself. Perhaps the hand-written parser will be quite simple, and the generated parser would handle a more complicated version of BNF. This pattern of using languages to describe languages, and bootstrapping languages using themselves, is very common when defining domain-specific languages.

Implementing a textual DSL by implementing its grammar like this can be a difficult and error-prone task, requiring significant expertise in language design and the use of a parser-generator. Implementing a parser-generator is definitely an expert task, because a grammar might be ambiguous or inconsistent, or might require a long look-ahead to decide what to do. Furthermore, there is more to implementing a language than just implementing a parser. We'd really like an editor for the language that gives the kinds of facilities we expect from a programming language editor in a modern development environment, like text colorization, real-time syntax checking, and autocompletion. If you include these facilities, the task of implementing a textual language can get very large. Happily, there are alternative strategies for implementing a textual DSL that don't involve implementing a new grammar.

The first strategy is to use the facilities of a host language to emulate the capabilities of a domain-specific language. For example, the following C# code has the effect of defining the same shape as the previous example:

Shape AnnotationShape = new Shape(ShapeKind.Rectangle,
Decorator Comment = new Decorator(Position.Center);

This kind of code is often called configuration code, because it uses previously defined classes and structures to create a specific configuration of objects and data for the problem that you want to solve. In effect, the definitions of these classes and structures are creating an embedded DSL, and the configuration code is using that DSL. The capabilities of modern languages to define abstractions such as classes, structures, enumerations, and even configurable syntax make them more amenable to this approach than earlier languages that lacked these facilities.
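
The same idea can be sketched in Python to show both halves of an embedded DSL: the class definitions that make up the "language" and the configuration code that uses it. The class names mirror the C# fragment, but the field names, the chaining method with_decorator, and the numeric and color values are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from enum import Enum

class ShapeKind(Enum):
    RECTANGLE = "Rectangle"
    ROUNDED_RECTANGLE = "RoundedRectangle"
    ELLIPSE = "Ellipse"

class Position(Enum):
    CENTER = "Center"
    TOP_LEFT = "TopLeft"
    TOP_RIGHT = "TopRight"
    BOTTOM_LEFT = "BottomLeft"
    BOTTOM_RIGHT = "BottomRight"

@dataclass
class Decorator:
    name: str
    position: Position

@dataclass
class Shape:
    name: str
    kind: ShapeKind
    width: float
    height: float
    fill_color: str
    outline_color: str
    decorators: list = field(default_factory=list)

    def with_decorator(self, name, position):
        # Hypothetical helper: returning self lets the configuration read as one expression.
        self.decorators.append(Decorator(name, position))
        return self

# Configuration code: placeholder values, not the chapter's figures.
annotation_shape = Shape(
    name="AnnotationShape",
    kind=ShapeKind.RECTANGLE,
    width=2.0,
    height=1.0,
    fill_color="White",
    outline_color="Black",
).with_decorator("Comment", Position.CENTER)
```

The enumerations play the role of the grammar's alternatives: a misspelled kind or position is rejected by the host language itself, with no parser to write.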

The second strategy is to use XML (Extensible Markup Language). There are many ways to express the definition in XML; here is one possible approach:

<?xml version="1.0" encoding="utf-8" ?>
<Shapes>
  <Shape name="AnnotationShape">
    <Decorator name="Comment">
    </Decorator>
  </Shape>
</Shapes>

The syntax is obviously limited to what can be done using XML elements and attributes. Nevertheless, the tags make it obvious what each element is intended to represent, and the meaning of the document is quite clear. One great advantage of using XML for this kind of purpose is the widespread availability of tools and libraries for processing XML documents.
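
As a small illustration of that tool support, the sketch below reads a shape document with Python's standard xml.etree.ElementTree module. The element layout (Kind, Width, Height, FillColor, OutlineColor, and Position as child elements under a Shapes root) follows the schema given in this chapter; the attribute values are placeholders, and the variable names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Placeholder document: the structure follows the chapter's schema, the values are assumed.
DOCUMENT = """<?xml version="1.0" encoding="utf-8" ?>
<Shapes>
  <Shape name="AnnotationShape">
    <Kind>Rectangle</Kind>
    <Width>2.0</Width>
    <Height>1.0</Height>
    <FillColor>White</FillColor>
    <OutlineColor>Black</OutlineColor>
    <Decorator name="Comment">
      <Position>Center</Position>
    </Decorator>
  </Shape>
</Shapes>
"""

# Parse from bytes so the encoding declaration in the document is honored.
root = ET.fromstring(DOCUMENT.encode("utf-8"))

summary = []
for shape in root.findall("Shape"):
    # Map each decorator's name to its position.
    decorators = {
        decorator.get("name"): decorator.findtext("Position")
        for decorator in shape.findall("Decorator")
    }
    summary.append(
        (shape.get("name"), shape.findtext("Kind"), float(shape.findtext("Width")), decorators)
    )
```

No tokenizer or grammar had to be written: the XML library does the parsing, and the remaining code only interprets element and attribute names.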

If we want to use standard XML tools for processing shape definitions, the experience will be much improved if we create a schema that defines rules for how shape definitions are represented in XML documents. Several technologies are available for defining such rules, including XML Schema from the World Wide Web Consortium (defined at www.w3.org/XML/Schema.html), RELAX NG from the OASIS consortium (defined at www.relaxng.org), and Schematron, which has been accepted as a standard by the International Organization for Standardization (ISO) and is defined at www.schematron.com. Schematron is supported in .NET: a version called Schematron.NET is downloadable from SourceForge, and it is possible to combine the facilities of XML Schema and Schematron. Here we'll use the XML Schema approach, which is also supported by the .NET Framework.

An XML Schema is an XML document written in a special form that defines a grammar for other XML documents. So, using an appropriate schema, we can specify exactly which XML documents are valid shape definition documents. Modern XML editors, such as the one in Visual Studio 2005, can use the XML Schema to drive the editing experience, providing the user with real-time checking of document validity, colorization of language elements, autocompletion of tags, and tips about the document's meaning when you hover over the elements.

Here is one of many possible XML schemas for validating shape definition documents such as the one presented earlier. Writing such schemas is something of an art; you'll certainly observe that it is significantly more complicated than the BNF that we defined earlier, although it expresses roughly the same set of concepts.

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Shapes">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="Shape">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Kind" type="kind" />
              <xs:element name="Width" type="xs:decimal" />
              <xs:element name="Height" type="xs:decimal" />
              <xs:element name="FillColor" type="xs:string" />
              <xs:element name="OutlineColor" type="xs:string" />
              <xs:element maxOccurs="unbounded" name="Decorator">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="Position" type="position" />
                  </xs:sequence>
                  <xs:attribute name="name" type="xs:string" use="required" />
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="name" type="xs:string" use="required" />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name="position">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Center" />
      <xs:enumeration value="TopLeft" />
      <xs:enumeration value="TopRight" />
      <xs:enumeration value="BottomLeft" />
      <xs:enumeration value="BottomRight" />
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="kind">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Rectangle" />
      <xs:enumeration value="RoundedRectangle" />
      <xs:enumeration value="Ellipse" />
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
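
To make concrete what such a schema enforces, the following Python sketch checks a few of the same rules by hand: the root element, the required name attributes, the enumerated kinds and positions, and the decimal form of Width and Height. It is only an approximation written for illustration. Python's standard library does not include an XML Schema validator, so real validation would rely on a schema-aware tool (a third-party library such as lxml is one option), and the function name check_shapes is invented here.

```python
import re
import xml.etree.ElementTree as ET

KINDS = {"Rectangle", "RoundedRectangle", "Ellipse"}
POSITIONS = {"Center", "TopLeft", "TopRight", "BottomLeft", "BottomRight"}
# Lexical form of xs:decimal: optional sign, digits, optional fractional part.
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def check_shapes(xml_text):
    """Return a list of problems found; an empty list means none of these checks failed."""
    root = ET.fromstring(xml_text)
    if root.tag != "Shapes":
        return [f"root element is <{root.tag}>, expected <Shapes>"]
    problems = []
    for shape in root.findall("Shape"):
        label = shape.get("name")
        if label is None:
            problems.append("Shape is missing the required 'name' attribute")
            label = "?"
        if shape.findtext("Kind") not in KINDS:
            problems.append(f"{label}: Kind must be one of {sorted(KINDS)}")
        for tag in ("Width", "Height"):
            text = shape.findtext(tag, default="").strip()
            if not DECIMAL.fullmatch(text):
                problems.append(f"{label}: {tag} must be a decimal number")
        for decorator in shape.findall("Decorator"):
            if decorator.get("name") is None:
                problems.append(f"{label}: Decorator is missing the required 'name' attribute")
            if decorator.findtext("Position") not in POSITIONS:
                problems.append(f"{label}: Position must be one of {sorted(POSITIONS)}")
    return problems

# Two small documents with made-up values: one that passes, one that breaks three rules.
GOOD = ("<Shapes><Shape name='A'><Kind>Ellipse</Kind><Width>1.5</Width><Height>2</Height>"
        "<Decorator name='C'><Position>Center</Position></Decorator></Shape></Shapes>")
BAD = ("<Shapes><Shape><Kind>Triangle</Kind><Width>wide</Width>"
       "<Height>2</Height></Shape></Shapes>")

good_problems = check_shapes(GOOD)
bad_problems = check_shapes(BAD)
```

Even this partial check needs a fair amount of code, and it ignores element order and occurrence counts. Stating the rules once in a schema lets validators and editors apply all of them without any such hand-written logic.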


To summarize, in this section we have looked at three ways of defining a textual DSL: using a parser-generator, writing configuration code embedded in a host language, and using XML with a schema to help validate your documents and provide syntax coloring and autocompletion. A further option would be to define an equivalent to the DSL Tools that targeted textual languages.

Each of these approaches has its pros and cons, but they all share a common theme—investing some resources early in order to define a language that will make it easier to solve specific problems later. This is the basic pattern that also applies to graphical DSLs, as we shall see.

The DSL Tools themselves provide no facilities in version 1 for defining textual domain-specific languages. The Tools' authors have taken the view that XML provides a sufficiently good approach to start with, and so they have designed the DSL Tools to integrate XML-based textual DSLs with graphical DSLs.
