DSLs “versus” software engineering

(New Year’s resolution: blog more frequently ;))

One aspect of using DSLs for software development which seems to be a bit underplayed (IMHO) is the role of good-ole software engineering, by which I happen to mean the practice of creating software through a systematic approach involving analysis, design, implementation and testing in a controlled and predictable manner. It seems to me there’s something of a sentiment or expectation that with DSLs you don’t have to do software engineering anymore, as if everything that makes creating software difficult somehow instantly disappears when using DSLs to model the software to as large a degree as is desirable and productive.

There are two main reasons for using DSLs:

  1. Empowering domain stakeholders (and other participants) by establishing a ubiquitous language for communication and providing them (and disciplines downstream) with dedicated tooling for that language;
  2. Separating the essential complexity (the what and why) from the incidental complexity (the how) by focusing the language on the former and hiding the latter “somewhere else”. (This also means that the software model does the same with less “code”.)

So, how does software engineering come into play then? Well, as I see it, there are two sides to this equation: (1) the DSL itself and the way it’s used and (2) the “meta software”, i.e. the software (parser, editor, tooling, etc.) which brings the DSL to life.

Engineering a DSL

To me, the fundamental value of software engineering is the set of concepts such as Separation of Concerns, Loose Coupling, Strong Coherence, (Unit) Testing etc., which allow us to create quality software. I define software quality somewhat unconventionally as “it works as required plus it can be productively changed in case the requirements change”. (It’s interesting to see that Kent Beck defines the concepts Loose Coupling and Strong Coherence in terms of how change spreads through a code base.) A DSL can easily be said to “work” if the two advantages mentioned above are realized: stakeholder empowerment and separating essential from incidental complexity, which is essentially another incarnation of Separation of Concerns. Unfortunately, it’s almost impossible to make this S.M.A.R.T. so you’ll have to rely on your craftsmanship here.

The most obvious aspect of DSL design which contributes directly to the change part of quality is: modularization. This aspect has two directions: (1) how do you distribute different aspects across parts of the language and (2) how can you cut up the entire domain into manageable pieces. Both of these directions benefit directly from the application of concepts like Separation of Concerns, Loose Coupling and Strong Coherence. As an example, consider a (vertical) DSL for Web application development which would typically address data, screen and interaction modeling: Do you create a separate DSL for the data model? Can you divide that up as well? Do you separate screens and interaction/flow? Do you take the use case as unit of granularity? Etcetera… The answers to all these questions are “It depends…” and you’ll have to address these time and time again for each situation or change to that.

But software engineering on the DSL side doesn’t stop there: the DSL instance, i.e. the model that’s built using the DSL, must be viewed as software as well. After all, it’s the centerpiece of the software creation. E.g., as soon as you can modularize your model, you’ll have to think about how to divide the DSL instance into several pieces. Questions which come into play here: How large should each piece be? What kind of inter-piece dependencies do I want to allow or minimize? (This already depends on how modularization is realized on the language level.) How does the versioning system you use affect these decisions? Again, Separation of Concerns, Loose Coupling and Strong Coherence are key concepts to keep in mind here.

You also might want to think about a way to (unit) test the instance. A famous example is business rules: it is very valuable to be able to test the execution of such rules in different scenarios to validate that the results are what your domain stakeholders are expecting. How you code such tests depends (as ever) on the situation: sometimes it’s better to code them against the business rules’ execution engine (which is something different from testing that execution engine itself!), sometimes you enhance the DSL (or probably better: create a separate DSL) for this purpose.
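To make the idea of scenario tests against an execution engine a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the `Rule` class, the two discount rules and the toy `execute` function stand in for a real DSL instance and a real rules engine. The point is only that the scenarios test the rules (the model), not the engine.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for a business-rule model: each rule pairs a condition
# with a discount percentage. A real DSL instance would be parsed, not hand-built.
@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[dict], bool]
    discount_pct: int

RULES = [
    Rule("loyal-customer", lambda order: order["years_customer"] >= 5, 10),
    Rule("bulk-order", lambda order: order["quantity"] >= 100, 5),
]

def execute(rules, order):
    """Toy execution engine (an assumption): sum the discounts of all applicable rules."""
    return sum(rule.discount_pct for rule in rules if rule.applies(order))

# Scenarios as agreed with domain stakeholders: (description, input, expected discount).
SCENARIOS = [
    ("new customer, small order", {"years_customer": 0, "quantity": 1}, 0),
    ("loyal customer, small order", {"years_customer": 7, "quantity": 1}, 10),
    ("loyal customer, bulk order", {"years_customer": 7, "quantity": 250}, 15),
]

def run_scenarios(rules, scenarios):
    """Return the descriptions of all scenarios whose outcome differs from the expectation."""
    return [desc for desc, order, expected in scenarios if execute(rules, order) != expected]
```

An empty list from `run_scenarios(RULES, SCENARIOS)` means the rules behave as the stakeholders expect. Keeping the scenarios as plain data is a deliberate choice here: it is a small step from there to a separate test DSL.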

Engineering the meta software

By “meta software” I mean all the software which is involved with the DSL and which is not directly contributing to the realization of requirements/business value. This ranges from parser definitions (when using a Parser Generator), parser implementations, model validators and model importers to code generators or model interpreters/execution engines. It’s important to realize that this is software in the traditional sense as well, not just a bunch of “utility scripts” lying around. In fact, your meta software has to be at least as good as “regular” software since it typically has a large impact on the rest of the software development because the model does more in the same amount of “code”. Among other things, this warrants that you create automated tests for all components, that these tests are part of the continuous integration (automated build) and that everything is checked into a versioning system. It also warrants that you design the meta software really well, e.g. with an eye towards Separation of Concerns, both inside components as well as across components. It’s advisable to choose a set of tools/frameworks which integrate well and offer as much IDE and compile-time support as possible, to make meta software development as productive, error- and care-free as possible. (I once did a project in which a UML model was fed directly into a combination of a dynamic language and StringTemplate: in the end you get used to it, but it’s painful all the way…)

If the DSL changes, then it might happen that the current model (DSL instance) breaks and must be repaired. Depending on the amount of breakage you could do that manually or expend the effort to devise an automated migration facility. This also means that it should be clear at all times which version of the DSL a model requires: it’s usually best to explicitly version both the DSL and the model, but you might be able to get away with an incremental push from a DSL development branch to the modeling branch. In the latter case, you really need to make sure that you tag the meta software together with releases of the software in order to be able to go back in history reliably.
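As a rough illustration of explicit versioning combined with an automated migration facility, consider the following Python sketch. The model format (a dict with a `dsl_version` key) and the two language changes are assumptions invented for the example, not taken from any particular tool.

```python
# Hypothetical model representation: a dict carrying the DSL version it was written for.
CURRENT_DSL_VERSION = 3

def _v1_to_v2(model):
    # Assumed change in DSL v2: "screens" was renamed to "pages".
    migrated = {key: value for key, value in model.items() if key != "screens"}
    migrated["pages"] = model.get("screens", [])
    return migrated

def _v2_to_v3(model):
    # Assumed change in DSL v3: every page needs an explicit "title".
    pages = [{**page, "title": page.get("title", page["name"])} for page in model["pages"]]
    return {**model, "pages": pages}

# One migration step per version bump, keyed by the version it upgrades from.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}

def migrate(model):
    """Upgrade a model step by step until it matches the current DSL version."""
    version = model["dsl_version"]
    if version > CURRENT_DSL_VERSION:
        raise ValueError(f"model requires DSL v{version}, tooling is v{CURRENT_DSL_VERSION}")
    while version < CURRENT_DSL_VERSION:
        # Each step returns a new dict; the input model is left untouched.
        model = {**MIGRATIONS[version](model), "dsl_version": version + 1}
        version += 1
    return model
```

Because each step only knows about one version bump, old models can be brought forward by chaining steps, and each step can be unit tested on its own.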

Using a DSL has a kind of a two- (or even multi-) step nature: the DSL instance is parsed first and then fed to either a code generator or a model interpreter. So, if you change the DSL you’ll probably have to change the second step as well. In fact, changes typically “flow backwards”: changes to or enhancements of the code generator or model interpreter often require a change to the DSL as well, e.g. in the form of a new language construct. This phenomenon is sometimes called coupled evolution. Again, having automated tests for all individual components/steps is really necessary to avoid running into problems. In case you use a model interpreter, it’s usually quite easy to write (unit) tests against that.
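The two-step nature, and the fact that each step can be tested separately, can be sketched as follows. The one-line-per-transition syntax, the `Transition` class and both functions are invented for this example; a real setup would use a parser generator or language workbench for the first step.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    source: str
    event: str
    target: str

# Made-up concrete syntax, one transition per line: <source> --<event>--> <target>
_LINE = re.compile(r"(\w+)\s+--(\w+)-->\s+(\w+)")

def parse(text):
    """Step 1: turn DSL text into a model (here: a list of transitions)."""
    model = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"line {line_no}: cannot parse {line!r}")
        model.append(Transition(*match.groups()))
    return model

def interpret(model, start, events):
    """Step 2: run the model; events without a matching transition leave the state unchanged."""
    table = {(t.source, t.event): t.target for t in model}
    state = start
    for event in events:
        state = table.get((state, event), state)
    return state

FLOW = """
# login flow
Login --success--> Home
Login --failure--> Login
Home --logout--> Login
"""
```

With this split, a parser test only compares `parse(FLOW)` against the expected list of `Transition` objects, while an interpreter test can build its model by hand and never touch the concrete syntax. A change to the syntax then breaks only the first kind of test.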

Change “versus” code generation

In case you have a code generator things are typically more difficult because (1) the generation often is target language-agnostic since most template engines (such as StringTemplate and Xpand) simply spit out “plain text” and (2) usually there’s non-generated code (either framework code or code behind the Generation Gap) as well which has to be integrated with the generated code. I’ve found that in these cases it’s extremely useful to use a relatively small reference application for the meta software discipline. Such a reference application would consist of a smallish DSL instance/model which does, however, touch all language constructs in the DSL (especially the new ones) and a fair number of combinations of those, plus non-generated code consisting of framework code, hand-crafted code behind the Generation Gap and, muy importante, unit tests to validate the reference application. As always, generating from the reference model, building the reference application from generated + non-generated code and running the unit tests should be automated to be able to verify the correctness of the meta software reliably and quickly.
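A heavily simplified sketch of such an automated reference-application check, in Python, could look like this. The construct names, the reference model, the template and the hand-written subclass are all hypothetical; the generator uses the standard library’s plain-text `string.Template` as a stand-in for a target language-agnostic template engine, and “building” is reduced to compiling and loading the generated source.

```python
from string import Template

# Hypothetical constructs of the DSL; the reference model should touch all of them.
LANGUAGE_CONSTRUCTS = {"entity", "required_field", "optional_field"}

REFERENCE_MODEL = {
    "entity": "Customer",
    "fields": [
        {"name": "name", "kind": "required_field"},
        {"name": "nickname", "kind": "optional_field"},
    ],
}

CLASS_TEMPLATE = Template(
    "class $entity:\n"
    "    def __init__(self, $params):\n"
    "$assignments\n"
)

def used_constructs(model):
    """Collect the language constructs a model actually exercises."""
    return {"entity"} | {field["kind"] for field in model["fields"]}

def generate(model):
    """Generation step: plain text out, no knowledge of the target language."""
    params = ", ".join(
        field["name"] if field["kind"] == "required_field" else f"{field['name']}=None"
        for field in model["fields"]
    )
    assignments = "\n".join(
        f"        self.{field['name']} = {field['name']}" for field in model["fields"]
    )
    return CLASS_TEMPLATE.substitute(
        entity=model["entity"], params=params, assignments=assignments
    )

def build(source):
    """'Build' step: compile and load the generated code into a namespace."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace

def verify_reference_application():
    missing = LANGUAGE_CONSTRUCTS - used_constructs(REFERENCE_MODEL)
    assert not missing, f"reference model does not exercise: {missing}"
    generated = build(generate(REFERENCE_MODEL))

    # Hand-written code behind the Generation Gap: it extends the generated class.
    class Customer(generated["Customer"]):
        def display_name(self):
            return self.nickname or self.name

    # Unit tests of the reference application.
    assert Customer("Ada").display_name() == "Ada"
    assert Customer("Ada", nickname="Countess").display_name() == "Countess"
    return True
```

Running `verify_reference_application()` from the continuous build exercises the whole chain in one go: a template that emits code which does not compile fails in `build`, and a generator change that breaks the contract with the hand-written code fails in the unit tests.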

Something which is very useful in this scenario is traceability, i.e. being able to see where generated code came from: in particular, which template produced a particular piece of code and what model element was the primary input for that template. A modest but already quite useful way to realize traceability is to generate comments with tracing information along with the real code. This is reminiscent of logging, also because care must be taken that the tracing information is succinct without overly “littering” the actual code.
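Generating trace comments can be sketched in a few lines of Python. The comment format, the template name `getter.tpl` and the helper functions are assumptions made up for this example; the lookup function merely shows that a fixed, succinct format is easy to follow back mechanically.

```python
import re

def traced(template_name, element_id, code):
    """Prefix generated code with a single, succinct trace comment."""
    return f"# generated by {template_name} from {element_id}\n{code}"

def generate_getter(entity, field):
    """Hypothetical 'template': emits a getter plus its trace comment."""
    code = f"def get_{field}(self):\n    return self._{field}\n"
    return traced("getter.tpl", f"{entity}.{field}", code)

_TRACE = re.compile(r"# generated by (?P<template>\S+) from (?P<element>\S+)")

def origin_of(generated, line_no):
    """Find the nearest trace comment at or above a 1-based line number."""
    lines = generated.splitlines()
    for line in reversed(lines[:line_no]):
        match = _TRACE.fullmatch(line.strip())
        if match:
            return match["template"], match["element"]
    return None
```

For example, given the output of two `generate_getter` calls concatenated, `origin_of` maps a line number from a compiler error or stack trace back to the template and model element responsible for it. One comment per generated unit keeps the littering to a minimum.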

Wrapping up

I hope I’ve been able to make the case that “old-fashioned” software engineering has a definite merit in the DSL scenario. Of course, there’s a lot more to say on this topic. Personally, I’m hoping that the upcoming book on DSL Engineering by Eelco Visser and Markus Voelter treats this really well.

  1. Flavia
    January 13, 2011 at 2:01 pm

    Nice blog Meinte. I think one of the main reasons why DSLs are frowned upon is because it causes people to come out of their comfort zones: functional specifiers from their word documents and developers from their coding (copy, paste most of the time). As long as people are comfortable in their roles, they don’t feel the need to transform to a ‘better’ or a ‘structured’ way of doing things. I call this ‘inertia’ 🙂
    But a crisis or a need to cut-down costs would be the right time to adopt such methods.

  2. James Roome
    January 13, 2011 at 10:10 pm

    Having played with Xtext/Xpand and MPS, the one thing that makes MPS stand out IMHO is the ability not to ‘spit out text’. In MPS you are going from model->model->model->model as far as you want. And of course it provides the capability of building a model by combining other models.

    • January 14, 2011 at 7:35 am

      Xtext/Xpand has this capability as well, in the form of the (OCL-like) Xtend model transformation language. The thing is: having a template engine which is target language-aware is really neat but it shouldn’t be an obstacle. E.g., I’d be hard-pressed to transform towards a Java model instance/AST instead of writing a template because the AST is so far removed from what we normally see and also because the model transformation code is bound to be extremely verbose. (This is actually one of the 3 major shortcomings of the official OMG’s MDA standard.)

      So, I’m inclined to use a target language-agnostic template engine and have all the other things (like continuous build and unit tests) in place -you need those anyhow since you can’t fully guarantee that something actually compiles, much less that it’s really semantically correct, using a target language-aware template engine.

      However, there are tools which give you a template engine which is both target language-aware and not an obstruction. The two I know of are Spoofax and the Intentional Domain Workbench and I’d imagine that MPS has a similar construct. In each case, you do need to devise a complete grammar/domain for the target language in all its gory details to make this useful, which is quite an effort so this would primarily work for mainstream languages.

  3. Dave Orme
    January 13, 2011 at 10:30 pm

    I like your blog article but would add a third reason for DSLs:

    Sometimes there’s an impedance mismatch between the programming language and the problem domain. Such an impedance mismatch is really a form of incidental complexity. Creating a DSL can resolve the impedance mismatch by factoring the incidental complexity into the DSL. The resulting API / “language” can be much cleaner, clearer, and more expressive of the programmer’s intent.

    Here is one example of this sort of DSL using SWT, RCP, and Scala:



    Dave Orme

    • January 14, 2011 at 7:22 am

      You’re right, although I feel this really is the 2nd reason as well since it’s about getting rid of incidental complexity as much as possible. The impedance mismatch you mention would actually be a very good (blog) topic on its own, especially because Model-Driven Software Development provides a nice way to deal with that: model transformations.

      I like your slide deck, both because of the subject and the fact it’s not PowerPoint. Some syntax highlighting for the Java and Scala sources would help make your case even better, IMHO.

