Xtext tip: “synthetic” parser rules
This is a quick one to share a simple trick which may come in handy when creating an Xtext grammar.
Let’s say your grammar has a type rule T1 (i.e., a rule which corresponds to an EClass in the Ecore meta model), and that some other type rule T2 contains that type, i.e., it has a feature someT1 to which something of type T1 is assigned. Now suppose you want to limit the syntactic possibilities for a T1 instance in that position, e.g. because T1 is a group of alternatives and a few of those alternatives are invalid when used inside T2.
This is a wholly legitimate situation because Xtext grammars usually have several responsibilities at the same time, among them (1) defining a mapping to an Ecore meta model and (2) defining the syntax of the DSL.
Let’s sum this situation up in some grammar code:
T1: A1 | A2 | A3;
T2: 'a-t2' someT1=T1;
Let’s say that we want to exclude A3 from the possible T1s in any T2. We could do this via a validation which simply checks the someT1 feature of any T2 and reports an error if it’s an A3. But that means that the parser itself still allows an A3 at that spot, which could open up a whole can of smelly worms, e.g. left-recursion or some ambiguity. Also, the content assist that comes out-of-the-box will still make syntax suggestions for A3.
Hence, we would like to inform the parser about the restricted syntax. One possibility would be:
T1: T1WithoutA3 | A3;
T1WithoutA3: A1 | A2;
T2: 'a-t2' someT1=T1WithoutA3;
This works perfectly, but it also ‘pollutes’ the meta model a bit. Since the meta model is mostly consumed by downstream clients like interpreters and code generators, this would only cause confusion. But more importantly: if we re-use an existing Ecore meta model (by means of the returns clause) this solution is not possible, since we would have to add a super type T1WithoutA3 to the A1 and A2 types which are sealed inside the re-used Ecore meta model – Xtext will issue an error as soon as we try it.
The clean solution is to use something which I’ve termed a “synthetic parser rule”, which has the following form:
T1: A1 | A2 | A3;
T1WithoutA3 returns T1: A1 | A2;
T2: 'a-t2' someT1=T1WithoutA3;
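To make the snippet above easier to try out, here is a minimal sketch of a complete grammar built around it. The grammar name, the namespace URI, the Model entry rule and the bodies of A1, A2 and A3 (including their keywords) are all made up for illustration; only the T1, T1WithoutA3 and T2 rules come from the text.

```xtext
grammar org.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals

// Hypothetical package name and namespace URI for the inferred meta model.
generate myDsl "http://www.example.org/mydsl/MyDsl"

// Hypothetical entry rule, only here so that every rule is reachable.
Model:
    t1s+=T1*
    t2s+=T2*;

T1:
    A1 | A2 | A3;

// Synthetic parser rule: it restricts the syntax but still returns T1,
// so it does not introduce a type of its own into the meta model.
T1WithoutA3 returns T1:
    A1 | A2;

T2:
    'a-t2' someT1=T1WithoutA3;

// Hypothetical alternatives; the real ones can be arbitrarily complex.
A1: 'a1' name=ID;
A2: 'a2' name=ID;
A3: 'a3' name=ID;
```

Under these assumptions, an input such as `a-t2 a1 foo` should parse, whereas `a-t2 a3 foo` should be rejected by the parser itself, and content assist inside a T2 should no longer propose the A3 syntax.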
Now there’s no pollution of the meta model, but the syntax is restricted as we’d like. Note that this is very much part of the standard Xtext repertoire, but the trick works especially well in the face of type hierarchies, re-used Ecore meta models or Xtext grammar inheritance.
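For the case of a re-used Ecore meta model, the same trick might look roughly like the sketch below. It assumes a hypothetical existing EPackage with namespace URI http://www.example.org/existing which already contains the EClasses T1, T2, A1, A2 and A3 (with A1, A2 and A3 as sub types of T1, a containment reference someT1 of type T1 on T2, and a string attribute name on the alternatives). The grammar name, the alias ex and the keywords are likewise invented for illustration.

```xtext
grammar org.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals

// Hypothetical namespace URI of the re-used Ecore meta model.
import "http://www.example.org/existing" as ex

// The first rule is the entry rule.
T2 returns ex::T2:
    'a-t2' someT1=T1WithoutA3;

T1 returns ex::T1:
    A1 | A2 | A3;

// Synthetic parser rule: no new EClass is needed in the imported
// meta model, since the rule simply returns the existing ex::T1.
T1WithoutA3 returns ex::T1:
    A1 | A2;

A1 returns ex::A1: 'a1' name=ID;
A2 returns ex::A2: 'a2' name=ID;
A3 returns ex::A3: 'a3' name=ID;
```

Because every rule maps onto a type that already exists, the imported meta model stays untouched.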