Multi-level modeling: what, why and how

One of the arguably classical problems of language engineering is literal notation. Basically, as soon as you can refer to (i.e., name) types, you’d also want to be able to instantiate those types. As so often happens, the label “classical problem” does not imply that it has been solved to any degree of satisfaction. Just look at Java, which still lacks a good literal syntax for object and collection instantiation, even with a library like Google Guava. Only fairly recently have more mainstream GPLs emerged that happen to have syntactically nice object and collection literals.

But in this blog I will only talk about multi-level modeling, which can be informally defined as being able to model “things” and to then also be able to model instances of those “things”. The multi-level aspect comes from the fact that instances of “things” live one meta level down from the “things” themselves. I will only consider 2-level modeling, i.e. we only “descend” one meta level. Arbitrarily deeply nested multi-level modeling is certainly possible, but it is also an order of magnitude more difficult and, for the purposes of our exposé, not that relevant.

As an example, consider the archetypical data modeling DSL in which you can model entities having named features: attributes with a data type or references to other entities. It’s often handy to be able to instantiate those entities, e.g. as initialisation or test data, alongside the definition. (Admittedly, this particular example will only really start to liven up when you start referencing those entities in business logic.) This is entirely analogous to the GPL case. For an actual example, consider the following grammar of an Xtext implementation of this DSL.

Entity: 'entity' name=ID '{'
          features+=Feature*
        '}';

Feature: Attribute | Reference;

Attribute: name=ID ':' dataType=DataType;
enum DataType: string | integer;

Reference: name=ID '->' entity=[Entity];

One way to provide some multi-level modeling capabilities is to include a concept EntityInstance in the abstract syntax, pointing to an Entity and containing FeatureValues which allow the user to put in literals or instances for each of the features. For the given Xtext example, this looks as follows:

EntityInstance:
  'instance-of' entity=[Entity] '{'
    values+=FeatureValue*
  '}';

FeatureValue: feature=[Feature] '=' value=Literal;

Literal:          AttributeLiteral | EntityInstance;
AttributeLiteral: StringLiteral | IntegerLiteral;
StringLiteral:    string=STRING;
IntegerLiteral:   integer=INT;

The main drawbacks to this approach are:

  • It provides one generic (i.e., uncustomisable) concrete syntax for literals.
  • It requires quite a few constraints to get this to work correctly.

In our simple case, this last point boils down to:

  1. feature must be one of parent(EntityInstance).entity.features

  2. type(value) must match type of feature

  3. every feature may only receive one value

The implementation of the scoping (constraint 1) and the validations (constraints 2 and 3) is not too problematic, but this language happens to be tiny. In general, the effort of implementing a type system, scoping, validations, content assist, etc. tends to grow superlinearly with the grammar’s size, so we can only expect this to get worse.
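To make the three constraints concrete, here is a minimal sketch in Python rather than in Xtext’s own scoping and validation APIs. It uses a small in-memory model that mirrors the grammar above. All class and function names (Entity, FeatureValue, validate, and so on) are made up for this illustration, and the feature cross-reference is kept as a plain name so that constraint 1 shows up as an explicit lookup.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Attribute:
    name: str
    data_type: str  # "string" or "integer", mirroring the DataType enum


@dataclass
class Reference:
    name: str
    entity: "Entity"


@dataclass
class Entity:
    name: str
    features: List[Union[Attribute, Reference]] = field(default_factory=list)


@dataclass
class FeatureValue:
    feature_name: str  # unresolved cross-reference to a Feature
    value: Union[str, int, "EntityInstance"]


@dataclass
class EntityInstance:
    entity: Entity
    values: List[FeatureValue] = field(default_factory=list)


def validate(instance: EntityInstance) -> List[str]:
    """Return human-readable violations of constraints 1 to 3."""
    errors: List[str] = []
    by_name = {f.name: f for f in instance.entity.features}
    seen = set()
    for fv in instance.values:
        feature = by_name.get(fv.feature_name)
        # Constraint 1: the feature must belong to the instantiated entity.
        if feature is None:
            errors.append(f"unknown feature '{fv.feature_name}'")
            continue
        # Constraint 3: every feature may receive at most one value.
        if fv.feature_name in seen:
            errors.append(f"duplicate value for '{fv.feature_name}'")
        seen.add(fv.feature_name)
        # Constraint 2: the value's type must match the feature's type.
        if isinstance(feature, Attribute):
            expected = str if feature.data_type == "string" else int
            ok = isinstance(fv.value, expected) and not isinstance(fv.value, bool)
        else:
            ok = (isinstance(fv.value, EntityInstance)
                  and fv.value.entity is feature.entity)
        if not ok:
            errors.append(f"type mismatch for '{fv.feature_name}'")
        # Nested instances are validated recursively.
        if isinstance(fv.value, EntityInstance):
            errors.extend(validate(fv.value))
    return errors


dialect = Entity("MarkdownDialect",
                 [Attribute("name", "string"), Attribute("author", "string")])
broken = EntityInstance(dialect, [FeatureValue("name", 42),
                                  FeatureValue("title", "x")])
print(validate(broken))
# → ["type mismatch for 'name'", "unknown feature 'title'"]
```

Even for this tiny language the checks already need a lookup table, a set of seen features and a case distinction per feature kind, which hints at how quickly the effort grows for a larger grammar.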

Wouldn’t it be nicer if each occurrence of Entity would automagically and dynamically be transformed into a concept in the abstract syntax? In our case, if we start with

entity MarkdownDialect {
  ^name : string
  ^author : string
}

then this would be expanded into

MarkdownDialect:
  'markdown-dialect' '{'
      ('name' '=' name=STRING)
    & ('author' '=' author=STRING)
  '}';

We see two challenges with this: both the abstract syntax (the meta model) and the concrete syntax have to be expanded on-the-fly. The concrete syntax is by far the bigger problem. In a textual (i.e., parsed) situation it is difficult, as it would mean extending the grammar according to some user-defined scheme and re-generating the parser and other artefacts. It’s not impossible, as e.g. SugarJ proves, but it’s not exactly mainstream. For projectional editing, it tends to be easier as the concrete syntax then is typically more fluid and described in terms of simple templates, instead of grammars.

But to do this, we will need to be able to transform relevant parts of the DSL prose into “dynamic parts” of the abstract + concrete syntax. This is a matter of two fairly simple transformations (one for abstract, one for concrete), provided we don’t introduce the chance of infinite recursion. However, there’s a bit of a challenge with regard to the dynamic nature of the syntaxes: if entities are changed or removed, the corresponding instances become invalid. This means that the editor must be aware, at least to some extent, of the fact that this might happen and consequently shouldn’t break.
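As a rough illustration of the concrete-syntax half of such a transformation, the following Python sketch renders an entity definition as the text of a grammar rule. It rests on several assumptions of mine: it handles attributes only (no references), it derives the keyword by kebab-casing the entity name, it uses the feature name as the assignment name, and the helper names (kebab_case, entity_to_rule, TERMINALS) are hypothetical. It also says nothing about the hard part, which is re-generating the parser from the extended grammar.

```python
import re
from typing import List, Tuple

# Maps the DSL's data types onto Xtext terminal rules (an assumption of this sketch).
TERMINALS = {"string": "STRING", "integer": "INT"}


def kebab_case(name: str) -> str:
    """Turn 'MarkdownDialect' into 'markdown-dialect' for use as a keyword."""
    return re.sub(r"(?<!^)(?=[A-Z])", "-", name).lower()


def entity_to_rule(entity_name: str, attributes: List[Tuple[str, str]]) -> str:
    """Render one parser rule with an unordered group of attribute assignments."""
    groups = [
        f"('{attr}' '=' {attr}={TERMINALS[data_type]})"
        for attr, data_type in attributes
    ]
    body = "\n    & ".join(groups)
    return (
        f"{entity_name}:\n"
        f"  '{kebab_case(entity_name)}' '{{'\n"
        f"      {body}\n"
        f"  '}}';"
    )


print(entity_to_rule("MarkdownDialect",
                     [("name", "string"), ("author", "string")]))
```

Because the rule is derived from the entity every time, renaming or removing a feature changes the generated rule, and existing instance text that still uses the old feature name no longer parses. That is exactly the invalidation problem the editor has to tolerate.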

  1. October 6, 2014 at 11:26 am

    Hi Meinte,

    I read your most recent blog entries about DSL definition, generic app langs, etc.

    It is not obvious what you are heading for:

    Do you want to extend your favourite tool chain (Xtext, Xtend) to provide more capabilities in the sense of Xsemantics or the TU Delft SDL tool family?

    or

    Are you looking for the ONE language to describe all concrete and abstract and tool/editor aspects of a language for professional and amateur language engineers?

    or

    Do you want to provide some functionality in a DSL for DSL definition, which is already (or easily) provided on an abstract model level (in the sense of mega-modeling)?

    I do not understand your focus on the syntax of a DSL for concrete syntax definition if you want to talk about multi-level modeling and the generation of the DSL code based on the semantic or execution model.

    Regards,

    Jürgen

    • October 6, 2014 at 5:57 pm

      Hi Jürgen,

      This multi-level blog isn’t specifically related to the GAL; it is more generally about the capabilities of language/modeling workbenches to support dynamic abstract as well as concrete syntax. The Xtext example is purely to demonstrate the idea and its peculiarities. I’m not intending to extend any of the tools you mention: Xtext would require a major rewrite, Xtend already effectively has object literal notation (through the => operator) and internal DSL capabilities, and I’m not working in any way with or on Spoofax.

      I’ve shied away from language workbenches a little: I’ve come to realize that most projects wouldn’t need their modeling languages (certainly not when you have templating and multi-level modeling in the GAL) and for those that do (to some extent), there are enough workbenches out there that are more than good enough.

      I think that concrete syntax plays a major role in multi-level modeling: it determines whether users can “click” with it. E.g., a major part of JetBrains’ activities w.r.t. MPS revolves around having a good concrete syntax and good (editor) behaviour for it. Multi-level modeling does not obviate the need for having a good (or any, for that matter) concrete syntax.

      Regards,

      Meinte
