The Xtext grammar language and its invisible infix operator

The nice thing about the Xtext framework is that it eats its own dog food: its grammar definition language (also called Xtext, of course; the framework suffers slightly from name reuse) is itself created using Xtext. There’s a wondrous principle called bootstrapping at work here 😉

Anyway, this means you can simply inspect the .xtext file and the accompanying Java customizations directly in the plugin in case you don’t believe the User Guide or find it lacking. One such case (at least for me) was the construction of unordered groups, and why the following grammar fragment didn’t produce the result I expected:

   transient?='transient'? & abstract?='abstract'? 'entity' name=ID;

I expected it to parse things like “transient entity Foo” and “abstract entity Bar”. It does that, but it doesn’t accept “abstract transient entity Foo”, while it does accept “abstract entity Bar transient”, which certainly didn’t check out with what I intended!

After looking into the grammar, I was a little surprised to find that the part of the language for defining parser rules actually uses an expression language for everything between the ‘:’ and ‘;’. (I shouldn’t have been, of course, since there’s an explicit reference to EBNF expressions and you can use parentheses to group what we can now, in all certainty, call subexpressions.)

This expression language sports the following infix operators, ordered in increasing precedence:

  1. ‘|’ for the regular alternative operator,
  2. ‘&’ for the unordered group operator and
  3. concatenation of “abstract tokens”, meaning assignments, keywords, rule calls, actions and everything parenthesized, not separated by anything other than (optional!) whitespace.

Item 3 says that token concatenation is an invisible infix operator…spooky! So,

   transient?='transient'? & abstract?='abstract'? 'entity' name=ID

actually means the same as

   transient?='transient'? & ( abstract?='abstract'? 'entity' name=ID )

because the invisible token concatenation operator has higher precedence than the unordered group operator. This explains why “abstract entity Bar transient” is accepted (and yields the same AST as “transient abstract entity Bar”) but “abstract transient entity Foo” is not (the abstract keyword is separated from the entity keyword). The fix is easy enough: just enclose the entire unordered group in parentheses, just as you would with a group of alternatives.
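As a minimal sketch of that fix (the rule name `Entity` is assumed here, since the fragment above shows only the rule body):

```xtext
// Hypothetical rule name; the original fragment omits the rule header.
Entity:
    // The parentheses limit the unordered group to the two optional modifiers,
    // so both may appear, in either order, before the 'entity' keyword.
    (transient?='transient'? & abstract?='abstract'?) 'entity' name=ID;
```

With this grouping, “abstract transient entity Foo” and “transient abstract entity Foo” should both parse, while “abstract entity Bar transient” should no longer be accepted.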

I also noticed that (lone) keywords and rule calls can have cardinality postfixes as well, at least syntax/grammar-wise. I haven’t checked what happens at and after generation, or whether the semantics are what you’d intuitively expect. It’s certainly something I haven’t seen used in any grammar so far!
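To make that observation concrete, here is a sketch of what such a fragment could look like. The rule name `Entity` and the trailing elements are made up for illustration, and the comments describe only what the grammar syntax admits, not verified generator behavior:

```xtext
// Hypothetical rule; only the syntax is illustrated here.
Entity:
    'entity' name=ID
    ';'*      // lone keyword with a cardinality postfix
    ID?;      // unassigned rule call with a cardinality postfix
```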

  1. Jan Koehnlein
    September 17, 2010 at 8:33 am

    Hi Meinte,

    nice blog entry. We actually had a discussion in the Xtext team on precedences in the Xtext grammar language and decided that the construct we expect to be the most common should have the highest precedence, and that is a Group (without any operator). In general, it’s good practice to add parentheses whenever you’re unsure. This also holds for other languages.

  2. baerrach
    November 28, 2012 at 11:41 pm

    Thank you.

    Google found this answer for me while I was trying to find out why my Xtext unordered group was not working as I expected, i.e. it didn’t allow elements in any order.
