Checklist for Xtext DSL implementations
Currently I’m in the US, working with a team that has been building a number of DSLs with Xtext for some time already. The interesting thing is that this team is quite proficient at it and at tackling all sorts of gnarly problems (coming either from a legacy language which they have to emulate to some degree or from end-user requirements), even though most of them have only been working with Xtext for a few months. However, during the past week I realized that I unconsciously use a collection of properties which I check or look for in Xtext DSLs, and because I use it unconsciously I wasn’t really aware that not everyone was doing the same. In effect, the team had already run into problems which they had solved, either completely or partly, in places downstream from the root causes. Those root causes generally resided at the level of the grammar or the scope provider implementation and would (for the most part) have been covered by my unconscious checklist. Had the team had my checklist, they’d probably have saved both time and headaches.
Since existing sources (i.e., the Xtext User Guide and, e.g., Markus Völter’s “MD* Best Practices” paper) are either reference material or quite general and somewhat hard to map onto daily Xtext practice, I figured I’d better make this list explicit. I divvied the checklist up into three sections: one concerning the Generate<MyDsl>.mwe2 file, one concerning the grammar file and one concerning the Java artifacts which augment the grammar.
Generator workflow
- Do the EPackages imported in the grammar file correspond 1:1 with the referencedGenModels in the workflow?
- Do you know/understand what the configured fragments (especially pertaining to naming, scoping, validation) provide out-of-the-box?
- Is backtracking set to false (default) in the options configuration for the XtextAntlrGeneratorFragment? I find that backtracking is rarely needed, and when it isn’t, enabling it introduces quite a performance hit. More importantly, it might hide ambiguities in the grammar (i.e., they don’t get reported during the generation phase) at points where you didn’t need the backtracking anyway.
To expand a little on the second item, here’s a list of the most important choices you’ve got:
- naming: exporting.SimpleNamesFragment versus exporting.QualifiedNamesFragment
- scoping: scoping.ImportURIScopingFragment versus scoping.ImportNamespacesScopingFragment
- validation.JavaValidatorFragment has two composedChecks by default: ImportUriValidator which validates importURI occurrences (only useful in case you’ve configured the ImportURIGlobalScopeProvider in the runtime module, either manually or by using ImportURIScopingFragment), and NamesAreUniqueValidator (which checks whether all objects exported from the current Resource have unique qualified names).
Grammar
- Any left-recursion? This should be pretty obvious since the Xtext generator breaks anyway and leaves the DSL projects in an unbuildable state.
- No ambiguities (red error messages coming from the ANTLR parser generator)? Ambiguities generally come either from the token level (e.g., a choice ‘|’ whose alternatives consume the same token type) or from overlapping terminal rules (somewhat rarer, since creating new terminal rules and/or changing existing ones is fortunately not that common).
- Does the grammar provide semantics which are not syntactical in nature? Generally: grammar is for syntax, the rest (scope provision, validation, name provision, etc.) is for semantics.
- Did you document the grammar by documenting the semantics of each of the rules, also specifying aspects such as naming, scoping, validation, formatting, etc. (unfortunately, in comment-form only)? Since the grammar is the starting point of the DSL implementation, it’s usually best to put as much info in there as possible…
- Did you add a {Foo} unassigned action to every rule which can complete without making any assignment, so that an object of the type is always created? (Saves you from unexpected NullPointerExceptions.)
To expand on the second item pertaining to ambiguities:
- Most ambiguities of the first kind are introduced by incorrect setup of an expression sub language. Make sure you use the pattern(s) described in Sven’s and two of my blog posts.
- Favor recursive over linear structures in the context of references into recursive structures. This makes implementing the scope provider all the easier (or even: possible). For an example of this: see this blog post.
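To illustrate why the usual expression setup avoids left-recursion, here is a toy recursive-descent parser in plain Java. It is only a sketch of the idea, not Xtext-generated code: the class, the Expr type and the Xtext rule quoted in the comment are hypothetical names chosen for this example. The loop plays the role of the repetition in the rule, and wrapping the tree built so far as the left child plays the role of the assigned action.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a left-factored expression rule. The Xtext rule it imitates
// would look roughly like this (hypothetical names):
//   Addition returns Expression:
//       Primary ({Addition.left=current} op=('+'|'-') right=Primary)*;
final class LeftFactoredExpressionSketch {

    // Hypothetical AST node: a number literal (op == null) or a binary operation.
    static final class Expr {
        final String op;
        final Expr left;
        final Expr right;
        final int value;

        Expr(int value) {
            this.op = null; this.left = null; this.right = null; this.value = value;
        }

        Expr(Expr left, String op, Expr right) {
            this.op = op; this.left = left; this.right = right; this.value = 0;
        }

        @Override
        public String toString() {
            return op == null ? Integer.toString(value) : "(" + left + " " + op + " " + right + ")";
        }
    }

    private final List<String> tokens;
    private int pos = 0;

    LeftFactoredExpressionSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Addition: Primary (('+'|'-') Primary)*
    // Iteration replaces the left-recursive call an LL parser cannot handle.
    Expr parseAddition() {
        Expr current = parsePrimary();
        while (pos < tokens.size() && (tokens.get(pos).equals("+") || tokens.get(pos).equals("-"))) {
            String op = tokens.get(pos++);
            Expr right = parsePrimary();
            // Same effect as {Addition.left=current}: the tree so far becomes the left child.
            current = new Expr(current, op, right);
        }
        return current;
    }

    // Primary: INT
    Expr parsePrimary() {
        return new Expr(Integer.parseInt(tokens.get(pos++)));
    }

    // Splits whitespace-separated input into tokens and parses it.
    static Expr parse(String input) {
        List<String> tokens = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            tokens.add(token);
        }
        return new LeftFactoredExpressionSketch(tokens).parseAddition();
    }

    public static void main(String[] args) {
        System.out.println(parse("1 - 2 + 3")); // prints ((1 - 2) + 3)
    }
}
```

The result is left-associative even though no rule ever calls itself as its first element, which is the property the expression patterns rely on.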
Java artifacts
First some checks which pertain to implementation of the custom local scope provider:
- Are you using the “narrow” form (signature: IScope scope_<Type>_<Reference>(<ContextType> context, EReference ref), where Reference is a feature of Type) as much as possible?
- Are you using the “wide” form (signature: IScope scope_<Type>(<ContextType> context, EReference ref)) where it makes sense?
- Have you chosen the ContextType (see the previous two items) to be convenient so you don’t need to travel up the containment hierarchy?
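To make these three checks a bit more tangible, here is a toy model in plain Java of how such methods get found by name: the narrow name is tried first, then the wide name (keyed on the type of the referenced elements), and for each name the lookup walks from the context object up through its containers until a method accepts that object as its context. This is a simplified sketch under my own assumptions, not Xtext’s implementation: Node, Model, Entity, Attribute, Operation and ToyScopeProvider are hypothetical, and plain strings stand in for both IScope and EReference.

```java
import java.lang.reflect.Method;

// Toy model of declarative scope-method lookup; NOT Xtext's implementation.
final class ScopeDispatchSketch {

    // Hypothetical stand-in for EObject: a node that only knows its container.
    public static class Node {
        final Node container;
        Node(Node container) { this.container = container; }
    }
    public static class Model extends Node { Model() { super(null); } }
    public static class Entity extends Node { Entity(Node container) { super(container); } }
    public static class Attribute extends Node { Attribute(Node container) { super(container); } }
    public static class Operation extends Node { Operation(Node container) { super(container); } }

    // Hypothetical provider. Strings stand in for IScope (result) and EReference (second parameter).
    public static class ToyScopeProvider {
        // Narrow form: the reference "type" of Attribute, with Entity as the convenient context type.
        public String scope_Attribute_type(Entity context, String ref) {
            return "narrow";
        }
        // Wide form: any reference whose target type is DataType, with Model as the context type.
        public String scope_DataType(Model context, String ref) {
            return "wide";
        }
    }

    // Tries the narrow name first, then the wide name; a real provider would delegate if both fail.
    static String findScope(Object provider, Node context, String ownerType, String ref, String targetType) {
        String scope = invokeUpwards(provider, "scope_" + ownerType + "_" + ref, context, ref);
        if (scope == null) {
            scope = invokeUpwards(provider, "scope_" + targetType, context, ref);
        }
        return scope != null ? scope : "delegate";
    }

    // Walks from the context up its container chain until a method with that name accepts the node.
    private static String invokeUpwards(Object provider, String name, Node context, String ref) {
        for (Node node = context; node != null; node = node.container) {
            for (Method method : provider.getClass().getMethods()) {
                Class<?>[] params = method.getParameterTypes();
                if (method.getName().equals(name) && params.length == 2 && params[0].isInstance(node)) {
                    try {
                        return (String) method.invoke(provider, node, ref);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Model model = new Model();
        Entity entity = new Entity(model);
        Object provider = new ToyScopeProvider();
        System.out.println(findScope(provider, new Attribute(entity), "Attribute", "type", "DataType"));
        System.out.println(findScope(provider, new Operation(entity), "Operation", "returnType", "DataType"));
        System.out.println(findScope(provider, entity, "Entity", "superType", "Entity"));
    }
}
```

Because the lookup hands the method the container of the requested type, a method declared with Entity or Model as its context type receives that object directly and never has to climb the hierarchy itself.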
For the rest of the Java artifacts:
- Is your custom IQualifiedNameProvider implementation bound in the runtime module?
- Does the bound IQualifiedNameProvider implementation compute a qualified name for the model root? (Important in case you’re using the org.eclipse.xtext.mwe.Reader class.)
- Have you implemented value converters (see §5.7) for all the data type rules in the grammar?
- Have you bound the value converter class in the runtime module?
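For readers who have not written a value converter yet, the contract boils down to a pair of functions: one turns the text matched by a data type rule into a model value, the other turns the value back into text for serialization. The sketch below shows that pair in plain Java. ToyValueConverter is a hypothetical, simplified interface made up for this example (it is not Xtext’s IValueConverter), and the Decimal rule in the comment is likewise an assumption.

```java
import java.math.BigDecimal;

// Toy model of the value-converter contract; the interface is a hypothetical
// simplification and not Xtext's IValueConverter.
final class ValueConverterSketch {

    interface ToyValueConverter<T> {
        T toValue(String text);    // text matched by the rule -> model value
        String toString(T value);  // model value -> text used when serializing
    }

    // Converter for a hypothetical data type rule:  Decimal: INT '.' INT;
    static final class DecimalConverter implements ToyValueConverter<BigDecimal> {

        @Override
        public BigDecimal toValue(String text) {
            if (text == null || text.isEmpty()) {
                throw new IllegalArgumentException("Cannot convert empty text to a decimal");
            }
            try {
                return new BigDecimal(text);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Not a decimal: " + text, e);
            }
        }

        @Override
        public String toString(BigDecimal value) {
            // toPlainString avoids scientific notation, which the rule above could not parse back.
            return value.toPlainString();
        }
    }

    public static void main(String[] args) {
        DecimalConverter converter = new DecimalConverter();
        BigDecimal value = converter.toValue("1.50");
        System.out.println(converter.toString(value)); // prints 1.50
    }
}
```

A useful sanity check for any converter is the round trip: converting to a value and back should yield text that the data type rule can parse again.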