Some musings on DSL design
I’d like to say a few things about language design and the DSLs I encounter “in the wild”. (Yes, a non-technical, non-Xtext-specific blog again: sorry ’bout that.)
First of all, about 90-95% of all the DSLs I encounter, e.g. on the Eclipse TMF Xtext forum, either deal with entity models (about 70% of the time) or are intrinsically “vertical” in nature. By “vertical” I mean that the DSL addresses a certain aspect of the solution space (or implementation), rather than an aspect coming from the problem space (or business domain). Examples of non-entity vertical DSLs would be screen descriptions and (Spring) configurations. Of course, in a(ny) typical application development project the entity model usually is the low-hanging fruit for enjoying the benefits of formal modeling, so that doesn’t surprise me at all.
What does surprise me, however, is that:
- people are not actually creating more business-readable DSLs;
- people apparently are sticking to a C-like syntax a fairly large percentage of the time!
By C-like syntax I particularly mean: the use of semicolons (‘;’) to “separate” or “close” statements, and the use of commas to separate items in a list. DSLs often are a lot less complex than C or Java and have fewer degrees of freedom than full-blown general-purpose languages (GPLs), so it ought to be straightforward (for the parser) to determine where a construct starts and ends. In fact, I’ve never encountered a situation where I actually needed a token to separate or close constructs to get my (Xtext) grammar definition to work.
And, in the end, syntactically really complex languages like Groovy and Scala also manage to do without mandatory separating/closing semicolons. So, why bother your DSL users with having to type an extra character? There’s a reason that modern IDEs like Eclipse have a feature to type that semicolon automatically for you. Maybe you want to avoid someone (possibly/especially yourself!) having to write that feature for your language?
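To make the point about construct boundaries a bit more concrete, here is a minimal sketch in Python (not an Xtext grammar) of a hand-written parser for a made-up entity DSL. The syntax `entity Name { field : Type ... }`, the `TOKEN` pattern and the function name `parse_entities` are all assumptions for the sake of illustration. The keyword and the fixed shape of each field are enough to tell where a construct starts and ends, so no ‘;’ is needed anywhere.

```python
import re

# Identifiers plus the three punctuation tokens of this hypothetical DSL.
# Whitespace and newlines are simply skipped by findall in this sketch.
TOKEN = re.compile(r"[A-Za-z_]\w*|[{}:]")


def parse_entities(text):
    """Parse 'entity Name { field : Type ... }' blocks into a dict."""
    tokens = TOKEN.findall(text)
    pos = 0
    entities = {}
    while pos < len(tokens):
        # The 'entity' keyword marks the start of a construct.
        head = tokens[pos:pos + 3]
        if len(head) < 3 or head[0] != "entity" or head[2] != "{":
            raise SyntaxError("expected: entity <Name> {")
        name = head[1]
        pos += 3
        fields = []
        # Each field has the fixed shape '<field> : <Type>', so the next
        # identifier after the type can only be the start of the next field.
        while pos < len(tokens) and tokens[pos] != "}":
            part = tokens[pos:pos + 3]
            if len(part) < 3 or part[1] != ":":
                raise SyntaxError("expected: <field> : <Type>")
            fields.append((part[0], part[2]))
            pos += 3
        if pos == len(tokens):
            raise SyntaxError("missing closing }")
        pos += 1  # skip the closing '}'
        entities[name] = fields
    return entities


example = """
entity Person {
  name : String
  age : Int
}
entity Address { street : String }
"""
model = parse_entities(example)
```

Whether the fields sit on separate lines or all on one line makes no difference to this parser, which is roughly the situation I tend to find myself in with real grammars as well.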
It’s the same for the comma as a list item separator: lists typically are already delimited at the start and the end (“[...]”), so it’s completely clear to the parser that we’re in a list context. Inside lists, it’s even more unlikely that you need a closing token at all, if only because I’ve never seen lists with an ending comma. One big advantage of not having item separators is that you can easily shuffle list items around without having to take care of the commas as well. (This also saves some effort when generating DSL texts.)
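As a rough illustration of the list case, the following Python sketch parses and generates a bracket-delimited list whose items are separated by whitespace only. The function names `parse_list` and `generate_list` and the one-item-per-line layout are assumptions of mine, and items are assumed to be simple words without embedded whitespace.

```python
def parse_list(text):
    """Parse '[ item item ... ]' where whitespace alone separates items."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise SyntaxError("expected a list delimited by [ and ]")
    # The brackets already tell us we are in a list context,
    # so splitting on whitespace is all that is left to do.
    return text[1:-1].split()


def generate_list(items):
    """Emit one item per line; no item needs first/last special-casing."""
    lines = ["["]
    for item in items:
        lines.append("  " + item)
    lines.append("]")
    return "\n".join(lines)


colors = parse_list("[ red green blue ]")
shuffled = generate_list([colors[2], colors[0], colors[1]])
```

In the generated text every item line looks the same, so moving a line up or down (or emitting the items from a simple loop) never leaves a dangling or missing comma behind.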
This preference for C-like syntax might also go a little way towards explaining my first point of surprise, since DSLs with a C-syntax flavor tend to evoke reactions like “…but I don’t want to have to do anything with programming!” So, folks: could we please refrain from using unnecessary statement and list item separator tokens in our DSLs? *pretty please with sugar on top*