Multi-level modeling: what, why and how

October 4, 2014

One of the arguably classical problems of language engineering is literal notation. Basically, as soon as you can refer to (i.e., name) types, you’d also want to be able to instantiate those types. As so often happens, the label “classical problem” does not imply that it has been solved to any degree of satisfaction. Just look at Java, which still lacks a good literal syntax for object and collection instantiation, even with a library like Google Guava. Only fairly recently have more mainstream GPLs emerged that have syntactically nice object and collection literals.

But in this blog I will only talk about multi-level modeling, which can be informally defined as being able to model “things” and then also model instances of those “things”. The multi-level aspect comes from the fact that instances of “things” live one meta level down from the “things” themselves. I will only consider 2-level modeling, i.e. we only “descend” one meta level. Arbitrarily deeply nested multi-level modeling is certainly possible, but it is also an order of magnitude more difficult and, for the purposes of our exposé, not that relevant.

As an example, consider the archetypical data modeling DSL in which you can model entities having named features: attributes with a data type or references to other entities. It’s often handy to be able to instantiate those entities, e.g. as initialisation or test data, alongside the definition. (Admittedly, this particular example will only really start to liven up when you start referencing those entities in business logic.) This is entirely analogous to the GPL case. For an actual example, consider the following grammar of an Xtext implementation of this DSL.

Entity: 'entity' name=ID '{'
          features+=Feature*
        '}';

Feature: Attribute | Reference;

Attribute: name=ID ':' dataType=DataType;
enum DataType: string | integer;

Reference: name=ID '->' entity=[Entity];

One way to provide some multi-level modeling capabilities is to include a concept EntityInstance in the abstract syntax, pointing to an Entity and containing FeatureValues which allow the user to put in literals or instances for each of the features. For the given Xtext example, this looks as follows:

EntityInstance:
  'instance-of' entity=[Entity] '{'
    values+=FeatureValue*
  '}';

FeatureValue: feature=[Feature] '=' value=Literal;

Literal:          AttributeLiteral | EntityInstance;
AttributeLiteral: StringLiteral | IntegerLiteral;
StringLiteral:    string=STRING;
IntegerLiteral:   integer=INT;

The main drawbacks to this approach are:

  • It provides one generic (i.e., uncustomisable) concrete syntax for literals.
  • It requires quite a few constraints to get this to work correctly.

In our simple case, this last point boils down to:

  1. feature must be one of parent(EntityInstance).entity.features

  2. type(value) must match type of feature

  3. every feature may only receive one value

The implementation of the scoping (constraint 1) and the validations (constraints 2 and 3) is not too problematic, but this language happens to be tiny. In general, the effort of implementing a type system, scoping, validations, content assist, etc. tends to grow superlinearly with the grammar’s size, so we can only expect this to get worse.
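To make the three constraints concrete, here is a minimal sketch in plain Python rather than in Xtext's own scoping and validation APIs. The class names mirror the grammar rules above, but `validate` and `value_matches` are hypothetical helpers invented for this illustration. Note that in Xtext constraint 1 would normally be handled by a scope provider rather than checked after the fact.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Attribute:
    name: str
    data_type: str  # "string" or "integer", mirroring the DataType enum


@dataclass
class Reference:
    name: str
    entity: "Entity"


@dataclass
class Entity:
    name: str
    features: List[Union[Attribute, Reference]] = field(default_factory=list)


@dataclass
class FeatureValue:
    feature: Union[Attribute, Reference]
    value: Union[str, int, "EntityInstance"]


@dataclass
class EntityInstance:
    entity: Entity
    values: List[FeatureValue] = field(default_factory=list)


def value_matches(feature, value) -> bool:
    """Constraint 2: the value's type must match the feature's type."""
    if isinstance(feature, Reference):
        return isinstance(value, EntityInstance) and value.entity is feature.entity
    if feature.data_type == "string":
        return isinstance(value, str)
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def validate(instance: EntityInstance) -> List[str]:
    """Collect violations of constraints 1-3, recursing into nested instances."""
    problems = []
    seen = set()
    for fv in instance.values:
        name = fv.feature.name
        # Constraint 1: the feature must belong to the instantiated entity
        if not any(f is fv.feature for f in instance.entity.features):
            problems.append(f"{name}: not a feature of {instance.entity.name}")
            continue
        # Constraint 2: the value must fit the feature's type
        if not value_matches(fv.feature, fv.value):
            problems.append(f"{name}: value has the wrong type")
        # Constraint 3: every feature may receive only one value
        if name in seen:
            problems.append(f"{name}: assigned more than once")
        seen.add(name)
        if isinstance(fv.value, EntityInstance):
            problems.extend(validate(fv.value))
    return problems
```

Even in this toy form, each new kind of feature or literal adds a branch to `value_matches` and usually a new check to `validate`, which hints at why the effort grows faster than the grammar does.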

Wouldn’t it be nicer if each occurrence of Entity were automagically and dynamically transformed into a concept in the abstract syntax? In our case, if we start with

entity MarkdownDialect {
  ^name : string
  ^author : string
}

then this would be expanded into

MarkdownDialect:
  'markdown-dialect' '{'
      ('name' '=' name=STRING)
    & ('author' '=' author=STRING)
  '}';

We see two challenges with this: both the abstract syntax/meta model and the concrete syntax have to be expanded on-the-fly. The concrete syntax is by far the bigger problem. In a textual (i.e., parsed) situation it is difficult, as it would mean extending the grammar according to some user-defined scheme and re-generating the parser and other artefacts. It’s not impossible, as e.g. SugarJ proves, but it’s not exactly mainstream. For projectional editing it tends to be easier, as the concrete syntax is then typically more fluid and described in terms of simple templates instead of grammars.

But to do this, we need to be able to transform the relevant parts of the DSL prose into “dynamic parts” of the abstract and concrete syntax. This is a matter of two fairly simple transformations (one for the abstract syntax, one for the concrete syntax), provided we don’t introduce the chance of infinite recursion. However, there’s a bit of a challenge with regard to the dynamic nature of the syntaxes: if entities are changed or removed, the corresponding instances become invalid. This means that the editor must be aware, at least to some extent, that this might happen, and consequently shouldn’t break.
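As a rough illustration of the concrete-syntax half, the following Python sketch turns an entity with attributes into grammar rule text in the shape shown earlier, using the feature name for both the keyword and the assignment. Everything here is an assumption made for the example: the helper names `keyword_for`, `rule_for` and `dangling_features`, the camel-case-to-kebab-case keyword scheme, and the decision to ignore references entirely. It only produces text; re-generating the parser from it is the hard part and is not addressed.

```python
import re
from typing import List, Tuple

# Assumed mapping from the DSL's data types to Xtext terminal rules
TERMINALS = {"string": "STRING", "integer": "INT"}


def keyword_for(entity_name: str) -> str:
    """Derive a keyword from an entity name, e.g. MarkdownDialect -> markdown-dialect."""
    return re.sub(r"(?<!^)(?=[A-Z])", "-", entity_name).lower()


def rule_for(entity_name: str, attributes: List[Tuple[str, str]]) -> str:
    """Emit one grammar rule with an unordered group of 'name = value' assignments."""
    lines = [f"{entity_name}:", f"  '{keyword_for(entity_name)}' '{{'"]
    for index, (name, data_type) in enumerate(attributes):
        # The first alternative is only indented; later ones are joined with '&'
        prefix = "      " if index == 0 else "    & "
        lines.append(f"{prefix}('{name}' '=' {name}={TERMINALS[data_type]})")
    lines.append("  '}';")
    return "\n".join(lines)


def dangling_features(assigned: List[str], attributes: List[Tuple[str, str]]) -> List[str]:
    """List assigned feature names that no longer exist on the (changed) entity."""
    known = {name for name, _ in attributes}
    return [name for name in assigned if name not in known]
```

Calling `rule_for("MarkdownDialect", [("name", "string"), ("author", "string")])` yields a rule of the kind shown for MarkdownDialect. A check like `dangling_features` is one simple way an editor could flag instances that went stale after an entity changed, instead of breaking on them.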