Archive for the ‘MDSD’ Category

A trick for speeding up Xtend building

September 24, 2012

I love Xtend and use it as much as possible. For code bases which are completely under my control, I use it for everything that’s not an interface or something that really needs to have inner classes and such.

As much as I love Xtend, the performance of its compilation (or “transpilation”) to Java source is not quite on the level of the JDK’s Java compiler. Matching that would be nearly impossible given the amount of effort that has gone into the Java compiler over the years and the fact that the team behind Xtend is only a few FTE (who have to take care of Xtext as well). Nevertheless, things can get out of hand relatively quickly and leave you with a workspace which needs several minutes to build fully and tens of seconds for an incremental build triggered by a change in a single Xtend file.

This performance (or lack thereof) for incremental builds is usually caused by a lot of interdependencies between Xtend sources. Xtend is an Xtext DSL and, as such, is aware of the fact that a change in one file can make it necessary for another file to be reconsidered for compilation as well. However, Xtend’s incremental build implementation is not (yet?) always capable of deciding when this is the case and when not, so it chooses to add all depending Xtend files to the build, and the files depending on those, and so forth – a learned word for this is “transitive build behavior”.

A simple solution is to program against interfaces. You’ve probably already heard this as a best practice before and outside of the context of Xtend, so it already has merits outside of compiler performance. In essence, the trick is to extract a Java interface from an Xtend class, “demote” that Xtend class to an implementation of that interface and use dependency injection to inject an instance of the Xtend implementation class. This works because the Java interface “insulates” the implementation from its clients, so when you change the implementation, but not the interface, Xtend doesn’t trigger re-transpilation of Xtend client classes. Usually, only the Xtend implementation class is re-transpiled.

In the following I’ll assume that we’re running inside a Guice container, so that the Xtend class is never instantiated explicitly – this is typical for generators and model extensions, anyway. Perform the following steps:

  1. Rename the Xtend class to reflect it’s the implementation class, by renaming both the file and the class declaration itself, without using the Rename Refactoring. This will break compilation for all the clients.
  2. Find the transpiled Java class corresponding to the Xtend class in the xtend-gen/ folder. This is easiest through the Ctrl/Cmd-Shift-T (Open Type) shortcut.
  3. Invoke the Extract Interface Refactoring on that one and extract it into a Java interface in the same package, but with the original name of the Xtend class.
  4. Have the Xtend implementation class implement the Java interface. Compilation should become unbroken at this point.
  5. Add a Guice annotation to the Java interface:
    @com.google.inject.ImplementedBy(...class literal for the Xtend implementation class...)
    

Personally, I like to rename the Xtend implementation class to have the Impl postfix. If I have more Xtend classes together, I tend to bundle them up into an impl sub-package.
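
As a rough illustration of why the interface insulates clients, here is a minimal plain-Java sketch. The names NameProvider, NameProviderImpl and Client are hypothetical, the implementation class merely stands in for the Java that Xtend would transpile, and the wiring is done by hand so that the sketch runs without Guice on the classpath. In the setup described above, the @com.google.inject.ImplementedBy annotation on the interface would take care of that wiring.

```java
// Entry point first, so the file can also be run directly as a single source file.
class InsulationSketch {
    public static void main(String[] args) {
        // Hand-wired here (an assumption for the sketch); inside a Guice container
        // the injector would supply the implementation instead.
        Client client = new Client(new NameProviderImpl());
        System.out.println(client.greet("Xtend"));
    }
}

// The extracted Java interface, carrying the original name of the Xtend class.
// In the real setup it would carry the @ImplementedBy annotation.
interface NameProvider {
    String displayName(String raw);
}

// Hypothetical stand-in for the Java transpiled from NameProviderImpl.xtend.
class NameProviderImpl implements NameProvider {
    @Override
    public String displayName(String raw) {
        return "<" + raw.trim() + ">";
    }
}

// A client that only ever mentions the interface, never the implementation.
class Client {
    private final NameProvider names;

    Client(NameProvider names) {
        this.names = names;
    }

    String greet(String raw) {
        return "Hello, " + names.displayName(raw) + "!";
    }
}
```

Because Client refers only to NameProvider, changing the body of NameProviderImpl does not touch anything a client depends on.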

Of course, every time the effective interface of the implementation class changes, you’ll have to adapt the corresponding interface as well – prompted by compilation errors popping up. I tend to apply this technique only as soon as the build times become a hindrance.

Categories: Xtend(2)

A (slightly) better switch statement in JavaScript

September 8, 2012

The switch statement in JavaScript suffers from the usual problem associated with C-style switch statements: fall through. This means that each case guard needs to be expressly closed with a break statement to avoid falling through to the first executable code after that – no matter which case that code belongs to. Fall through has been the source of very many bugs. Unfortunately, the static code analysis tools for JavaScript (JSLint, JSHint and Google’s Closure compiler) do not check for potential fall through (yet?).

Today I thought I could improve the switch statement slightly with the following code pattern:

var result = (function(it) {
    switch(it) {
        case 'x': return 1;
        case 'y': return 2;
        /* ... */
        default: return 0;
    }
})(my_it);

The advantage of using return statements is two-fold:

  1. it exits the switch statement immediately,
  2. it usually comes right after the case guard, making visual inspection and verification much easier than hunting for a break either in- or outside of a nice pair of curly braces.

This approach also has a definite functional programming flavor, as we’ve effectively turned the switch statement into an expression, since the switch statement is executed as part of a function invocation.

Postscript

Yes, I do write JavaScript from time to time. I usually don’t like the experience very much, mostly because of inadequate tool support and the lack of static typing (and the combination thereof: e.g. the JS plug-ins for Eclipse often have a hard time making sense of the code at all). But we do what we can to get by 😉

Groovy-type builders and JSON initializers in Xtend

September 3, 2012

One of the nicer features of dynamic languages like Ruby, Groovy, etc. is the possibility to easily implement builders, which are constructs to build up tree-like structures in a very succinct and syntactically noise-free way. You can find some Groovy examples here – have a special look at the HTML example. Earlier, Sven Efftinge wrote a blog post on implementing the same type of builders using Xtend. My blog post will expand a little on his post by providing the actual code and another example.

The main reason that builders come naturally in dynamic languages is that metaprogramming allows adding “keywords” to the language without the need to actually define them in the form of functions. In the case of HTML, these “keywords” are the tag names. Statically-typed languages like Java or Xtend do not have that luxury (or “luxury” as the construct can easily be misused) so we’ll have to do a little extra.

The HTML example

You’ll find Sven’s original example reproduced in HtmlDocumentExample. Note that because of the point mentioned above, we need to have compile-time representations of the HTML DOM elements we’re using. Sven has written these manually but I’m afraid that I’m too lazy to do that so I opted for a generative approach. Apart from that, the example works exactly the same, so I’ll refer to his original blog for the magic details – some of which I’ll re-iterate for the JSON example below.

Generation

Some basic HTML DOM element types are provided by the BaseDomElements file. Note two nice features of Xtend 2.3+: one file can hold multiple Xtend classes and the use of the @Data annotation as a convenient way to define POJOs – or should those be called POXOs? 😉 The POJOs for the other DOM elements are generated by the GenerateDomInfrastructure main class: note they are generated as POXOs which are then transpiled into Java. After running the GenerateDomInfrastructure main class and refreshing the Eclipse project, the HtmlDocumentExample class should compile.

A JSON example

You can find the JSON example in JsonResponseExample. For convenience and effect, I’ll reproduce it here:

class JsonResponseExample {

    @Inject extension JsonBuilder

    def example() {
        object(
            "dev"        => true,
            "myArray"    => array("foo", "bar"),
            "nested"     => object("answer" => 42)
        )
    }

}

To be able to compile this example, you’ll need my fork of Douglas Crockford’s Java JSON library with the main differences being that it’s wrapped as an Eclipse plug-in/OSGi bundle and it’s (as properly as manageable) generified. In addition to the GitHub repo, you can also directly download the JAR file.

As with the HTML example, the magic resides in the line which has a JsonBuilder Xtend class Guice-injected as an extension, meaning that you can use the functions defined in that class without needing to explicitly refer to it. This resembles a static import but without the functions/methods needing to be static themselves. The JsonBuilder class has two factory functions: object(..) builds a JSONObject from key-value pairs and array(..) builds a JSONArray from the given objects.

The fun part lies in the overloading of the binary => operator by means of the operator_doubleArrow(..) function which simply returns a Pair object suitable for consumption by the object(..) factory function. This allows us to use the “key => value” syntax demonstrated in the example. Note that the => operator has no pre-existing meaning in Xtend – the makers of Xtend have been kind enough to provide hooks for a number of such “user-definable” operators: see the documentation.
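
For readers who think in Java, the builder idea can be sketched roughly as follows. This is not the actual JsonBuilder: pair(..) plays the role of operator_doubleArrow(..), and a LinkedHashMap and an ArrayList stand in for JSONObject and JSONArray so that the sketch runs without the JSON library. All names in it are assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Java rendering of the builder; not the JsonBuilder from the post.
class JsonBuilderSketch {

    // Plays the role of operator_doubleArrow(..): pairs a key with a value.
    static Map.Entry<String, Object> pair(String key, Object value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // Plays the role of object(..): collects the pairs, keeping insertion order.
    @SafeVarargs
    static Map<String, Object> object(Map.Entry<String, Object>... pairs) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> p : pairs) {
            result.put(p.getKey(), p.getValue());
        }
        return result;
    }

    // Plays the role of array(..): wraps the given items in a list.
    static List<Object> array(Object... items) {
        return new ArrayList<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        // Mirrors the structure built in JsonResponseExample.
        Map<String, Object> json = object(
            pair("dev", true),
            pair("myArray", array("foo", "bar")),
            pair("nested", object(pair("answer", 42)))
        );
        System.out.println(json);
    }
}
```

The Xtend version reads better mainly because the extension import and the operator overload remove the pair(..) calls and the class qualifier from the call site.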

I’m still trying to find a nice occasion to use the so-called “spaceship” or “Elvis” operators: now that would be positively…groovy 😉

Postscript

I failed to notice earlier that Xtend already has an -> operator which does exactly the same thing that our => operator does. Also, the “Elvis” ?: operator already has a meaning: “x ?: y” returns x if it’s non-null and y otherwise. This makes for a convenient way to set up default values. You’ll find both overloading definitions in the org.eclipse.xtext.xbase.lib.ObjectExtensions class.

Categories: Xtend(2)

Polymorphic dispatch in Xtend

August 3, 2012

Polymorphic dispatch (also known as multiple dispatch, see http://en.wikipedia.org/wiki/Multiple_dispatch, or multimethods) is a programming language construct which chooses a code path based on runtime types instead of types that are inferred at compile time. The poor man’s method of achieving such behavior would be to litter your code with prose like this:

if( x instanceof TypeA ) { (x as TypeA).exprA }
else if( x instanceof TypeB ) { (x as TypeB).exprB }
else ...

Xtend 2.x offers two much better constructs to do polymorphic dispatching:

  1. Through the use of the dispatch modifier for function defs – this construct is Xtend’s “official” polymorphic dispatch.
  2. Through the use of the switch statement and referring to types instead of cases.

These are better than the poor man’s method because they are declarative, i.e.: they express intent much more clearly and succinctly. Though both constructs have a lot in common, there are some marked differences and Best Practices for safeguarding type safety which I’ll discuss in this blog post.

The example

Consider the following Xtend code – note that the syntax coloring is lacking a bit, but WordPress doesn’t fully understand Xtend – …yet…. Also note that since Xtend 2.3 you can have more than one Xtend class in one file.

class CommonSuperType { ... }
class TypeA extends CommonSuperType { ... }
class TypeB extends CommonSuperType { ... }
class TypeC extends CommonSuperType { ... }
class UnrelatedType { ... }
class Handler {

  def dispatch foo(TypeA it) { it.exprA }
  def dispatch foo(TypeB it) { it.exprB }

  def bar(CommonSuperType it) {
    switch it {
      TypeA: it.exprA
      TypeB: it.exprB
    }
  }
}

For simplicity’s sake, let’s assume that exprA and exprB both return an int. The Xtend compiler generates two public methods in the Java class Handler – one for foo, one for bar. Both of these have the same signature: int f(CommonSuperType), where f = foo or bar. In addition, for each foo dispatch function, Xtend generates a public method with signature int _foo(t), where t = TypeA or TypeB – note the prefixed underscore. The actual polymorphic dispatch then happens in the “combined” foo(CommonSuperType) method, actually through the previously demonstrated poor man’s method.

By the way: a user-friendly way to inspect the “combined” method is the Outline which will group the dispatch functions belonging together under the combined signature.

Note that both the foo and the bar method end up with CommonSuperType as the type of their parameter. For foo, this is because CommonSuperType is the most specific common super type of TypeA and TypeB – deftly implied by the name – and Xtend infers that as the parameter’s type for the “combined” method. In general, Xtend will compute the most specific common super type across all dispatch functions, on a per-argument basis. In case of the bar method we declared ourselves what the parameter type is.
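
To make this concrete, here is a hand-written Java approximation of what the transpiler produces for Handler. It is a sketch, not the literal generated code: the return values 1, 2 and 3, the exact exception message and the visibility of the methods are assumptions, and the real output for the switch is more verbose.

```java
import java.util.Arrays;

// Entry point first, so the file can also be run directly as a single source file.
class DispatchSketch {
    public static void main(String[] args) {
        Handler handler = new Handler();
        System.out.println(handler.foo(new TypeA())); // dispatched to _foo(TypeA)
        System.out.println(handler.bar(new TypeC())); // no matching case in bar
    }
}

class CommonSuperType {}
class TypeA extends CommonSuperType { int exprA() { return 1; } } // value is an assumption
class TypeB extends CommonSuperType { int exprB() { return 2; } } // value is an assumption
class TypeC extends CommonSuperType { int exprC() { return 3; } } // value is an assumption

class Handler {

    // One underscore-prefixed method per dispatch function.
    int _foo(TypeA it) { return it.exprA(); }
    int _foo(TypeB it) { return it.exprB(); }

    // The "combined" method: an instanceof cascade over the dispatch cases,
    // ending in an exception for anything that is not handled.
    int foo(CommonSuperType it) {
        if (it instanceof TypeA) {
            return _foo((TypeA) it);
        } else if (it instanceof TypeB) {
            return _foo((TypeB) it);
        } else {
            throw new IllegalArgumentException(
                "Unhandled parameter types: " + Arrays.<Object>asList(it));
        }
    }

    // The type switch without a default case: an unmatched value silently
    // yields the default value of the result type.
    int bar(CommonSuperType it) {
        int result = 0;
        if (it instanceof TypeA) {
            result = ((TypeA) it).exprA();
        } else if (it instanceof TypeB) {
            result = ((TypeB) it).exprB();
        }
        return result;
    }
}
```

The only functional difference between the two methods is the final else branch: foo complains about a TypeC, bar quietly returns 0.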

As demonstration, add the following code to the Handler class and see what happens:

  def dispatch foo(TypeC it) { it.exprC }

(Assume that exprC again returns an int.)

The generated foo and bar methods are functionally nearly identical, the difference being that foo explicitly throws an IllegalArgumentException mentioning the unhandled parameter type(s) in its message, in case you called it with something that is a CommonSuperType but neither of TypeA nor of TypeB. The bar method does no such thing and simply falls through the switch, returning the appropriate default value: typically null but 0 in our int-case. To remedy that, you’ll have to add a default case which throws a similar exception, like so:

  def bar(CommonSuperType it) {
    switch it {
      TypeA: it.exprA
      TypeB: it.exprB
      default:
        throw new IllegalArgumentException("don't know how to handle sub type " + it.^class.simpleName)
    }
  }

In case you already have a sensible default case, you’re basically out of luck.

Potential mistakes

Both approaches have their respective (dis-)advantages which I’ll list comprehensively below. In both cases, though, it’s relatively easy to make programmers’ mistakes. The most common and obvious ones are:

  1. The parameter type of the “combined” foo method is inferred, so if you add a dispatch function having a parameter type which does not extend CommonSuperType, then the foo method will wind up with a more general parameter type – potentially Object. This means that the foo method will accept a lot more types than usually intended and fail miserably (through a thrown IllegalArgumentException) on most of them. This is especially dangerous for public (which happens to be the default visibility!) function defs.
  2. Xtend will not warn you at editor/compile time about the “missing” case TypeC: it’s a sub type of CommonSuperType but not of TypeA nor of TypeB. At runtime, the bar method will simply fall through and return 0.
  3. The return type of the combined method is also inferred as the most specific common super type of the various return types – again, potentially Object. This is usually much less of a problem because that inferred type is checked against the parameter type of clients of the combined method.

This shows that these constructs require us to do a little extra to safeguard the type safety we so appreciate in Xtend.
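
The first mistake is perhaps easiest to see in Java terms. The sketch below approximates the “combined” method after a dispatch function for an unrelated type has been added. The class names and return values are hypothetical and the code Xtend actually generates will differ in detail.

```java
// Entry point first, so the file can also be run directly as a single source file.
class WideningSketch {
    public static void main(String[] args) {
        Widened handler = new Widened();
        System.out.println(handler.foo(new A()));
        try {
            // Compiles without complaint, because the parameter type is now Object.
            handler.foo("not something foo was ever meant to handle");
        } catch (IllegalArgumentException e) {
            System.out.println("runtime failure: " + e.getMessage());
        }
    }
}

class Base {}
class A extends Base { int exprA() { return 1; } }      // value is an assumption
class Unrelated { int someExpr() { return 99; } }       // value is an assumption

class Widened {
    int _foo(A it) { return it.exprA(); }
    int _foo(Unrelated it) { return it.someExpr(); }

    // A and Unrelated share no super type other than Object,
    // so Object becomes the parameter type of the combined method.
    int foo(Object it) {
        if (it instanceof A) {
            return _foo((A) it);
        } else if (it instanceof Unrelated) {
            return _foo((Unrelated) it);
        } else {
            throw new IllegalArgumentException("Unhandled parameter types: " + it);
        }
    }
}
```

The compiler can no longer reject a wrong argument, so the error only surfaces when the call is actually executed.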

Advantages and disadvantages of both constructs

We list some advantages and disadvantages of both constructs. Advantages of the dispatch construct:

  • Provides more visual code space. This is useful if the handling of the separate types typically needs more than 1 line of code.
  • Explicit handling of unhandled cases at runtime.

Disadvantages of the dispatch construct:

  • Automagically infers parameter types of the “combined” method as the most specific common super types. In case of a programmer error, this may be (much) too wide.
  • Takes up more visual code space/more syntactic noise.

Advantages of the switch construct:

  • Takes up less visual code space. This is useful if the handling of the separate types doesn’t need more than one line of code.
  • It’s a single expression, so you can use it as such inside the function it’s living in. Also, you can precompute “stuff” that’s useful for more than one case.

Disadvantages of the switch construct:

  • Fall-through of unhandled cases at runtime, resulting in an (often) non-sensical return value. You have to add an explicit default case to detect fall-through.

Mixing polymorphic and ordinary dispatch

Since Xtend version 2.3, you are warned about dispatch functions whose signature is compatible with that of a non-dispatch function and vice versa. As an example, consider the following addition to the Handler class:

  def foo(TypeC it) { it.exprC }
  def foo(UnrelatedType it) { it.someExpr }

Here, exprC again returns an int, but someExpr may return anything. Note that both functions are not of the dispatch persuasion.

The first line is flagged with the warning “Dispatch method has same name and number of parameters as non-dispatch method”, which is a just warning in my book. However, this warning is also given for the second line, as well as for the first two foo functions. (Note that the warnings are also given with only one of these extra functions present.) In that case, it’s not always a helpful warning but it does riddle your code file with warnings.

To get rid of the warnings, I frequently make use of the following technique:

  • “Hide” all dispatch functions by giving them (and only these) an alternate name. My personal preference is to postfix the name with an underscore, since the extra _ is visually inconspicuous enough to not dilute the intended meaning. Also, give them private visibility to prevent prying eyes.
  • Create an additional function with the same signature as the “combined” method for the dispatch functions, calling those.

The net result is that you get rid of the warnings, because there’s no more mixture of dispatch and non-dispatch functions with compatible signatures. Another upshot is that the signature of the “combined” method is now explicitly checked by the additional function calling it – more type safety, yeah! Of course, a disadvantage is that you need an extra function but that typically only is one line of code.

In the context of our example, the original two foo functions are replaced by the following code:

  def foo(CommonSuperType it) { foo_ }

  def private dispatch foo_(TypeA it) { it.exprA }
  def private dispatch foo_(TypeB it) { it.exprB }

Categories: The How, Xtend(2)

Language Workbench Challenge 2012

March 31, 2012

Apologies for the radio silence: it’s been way too long since I’ve done some blogging.

In my defense, the last couple of months have been pretty busy: I’ve been working a lot on my language workbench (my own startup) and I also got heavily involved with another startup. Last week I was in Cambridge, UK for the Code Generation 2012 conference and the co-located Language Workbench Challenge during which I presented said language workbench for the first time – more on that in a minute.

As it turns out, the next couple of months are going to be slightly less busy and should allow me to write some more blogs. I got some material pent up already that’s good to go after a little spit-shine.

By the way: I was thrilled to notice that views have not really dropped all that much during the hiatus since my last blog post! 🙂

Language Workbench Challenge 2012

The #lwc2012 is an event that is co-located with the Code Generation conference and takes place a day before it. The essence of the #lwc2012 -at least: to me- is to challenge the various language workbench creators with new ideas, levels of maturity, etc. The sheer variance among the “contenders” is so big that it’s quite impossible to judge them on any objective and/or quantitative scale – this explains why the name Competition would be quite unjust. For more information on the event itself, I’m going to refer to the official site. This will probably be updated soon by organizers Angelo Hulshout and Paul Zenden with a nice summary of the event and possibly even videos of the various presentations (including mine).

My primary goal was to gather feedback on my ideas around and implementation of Más, which is a Cloud-based domain language workbench for creating domain-specific languages and using these to model “stuff”. The feedback I got during the challenge was quite positive, in general. I already knew that the UI and the editing behavior still leave much to be desired but I was positively surprised by the fact that people other than myself were able to use it to do parts of the extension assignment. The fact that you can do graphical modeling with nothing more than regular HTML, CSS, JavaScript plus a bit of HTML5 Canvas seemed to surprise plenty of people.

On the whole, the assignment -creating a modeling environment for the Piping & Instrumentation domain, including code generation and preferably doing or triggering simulation- itself was somewhat cumbersome for several reasons.

First of all, the reference implementation made use of a rather old-fashioned piece of proprietary Windows software which didn’t really provide a very clear reference for the code generation and triggering of the simulation. To get around that, I simply took the MetaEdit+ implementation which the nice people of MetaCase were good enough to share with the world at large, ran their code generation against an equivalent model and re-implemented that. I didn’t bother with the Windows thing beyond that.

Secondly, the two “domain experts” (or at least, the two people most knowledgeable on the domain, being Paul Zenden and Juha-Pekka Tolvanen) on site were also two challengers. Especially the extension assignment could have benefited from a clarification by unbiased domain stakeholders. Angelo and I have already exchanged some ideas on how to do that differently next year – in particular: it would be nice to be able to consult real, on-site domain stakeholders which would be available during the preparation of the assignment as well. The extension assignment also didn’t really address the workbenches’ capability to really extend the language.

Overall, the event was quite inspiring to me and has provided me with encouragement to continue with the development of Más as well as with a couple of ideas I didn’t have beforehand. Stay tuned for more on that in the future 🙂

Categories: DSLs, MDSD

Xtext tip: “synthetic” parser rules

December 22, 2011

This is a quick one to share a simple trick which may come in handy when creating an Xtext grammar.

Let’s say your grammar has a type rule T1 (i.e., a rule which corresponds to an EClass in the Ecore meta model). Let’s also say that some other type rule T2 composes that type somehow, i.e., it has a feature someT1 to which something of type T1 is assigned. Let’s say that you want to limit the syntactic possibilities for the composition of a T1 instance somewhat, e.g. in the case that T1 is a group of alternatives but a few alternatives are invalid when used inside T2.

This is a wholly legitimate situation because Xtext grammars usually have a number of responsibilities at the same time, amongst which are defining (1) a mapping to an Ecore meta model and defining (2) the syntax of the DSL.

Let’s sum this situation up in some grammar code:

T1: A1 | A2 | A3;

T2: 'a-t2' someT1=T1;

Let’s say that we would want to exclude A3 from the possible T1’s in any T2. We could do this via a validation which simply checks the someT1 feature of any T2, reporting an error if it’s an A3. But that means that the parser itself still allows an A3 at that spot which could open up a whole can of smelly worms – e.g., left-recursion or some ambiguity. Also, the content assist that comes out-of-the-box will make syntax suggestions for A3.

Hence, we would like to inform the parser about the restricted syntax. One possibility would be:

T1: T1WithoutA3 | A3;

T1WithoutA3: A1 | A2;

T2: 'a-t2' someT1=T1WithoutA3;

This works perfectly, but it also ‘pollutes’ the meta model a bit. Since the meta model is mostly consumed by downstream clients like interpreters and code generators, this would only cause confusion. But more importantly: if we re-use an existing Ecore meta model (by means of the returns clause) this solution is not possible, since we would have to add a super type T1WithoutA3 to the A1 and A2 types which are sealed inside the re-used Ecore meta model – Xtext will issue an error as soon as we try it.

The clean solution consists of using something which I’ve termed a “synthetic parser rule” and has the following form:

T1:  A1 | A2 | A3;

T1WithoutA3 returns T1: A1 | A2;

T2: 'a-t2' someT1=T1WithoutA3;

Now there’s no pollution of the meta model, but the syntax will be restricted as we’d like it. Note that this is very much something which is part of the standard Xtext repertoire but this trick works especially well in the face of type hierarchies and re-used Ecore meta models or inherited Xtext grammars.

Using syntactic predicates in Xtext, part 2

December 20, 2011

This blog is a continuation of the previous one about how to use syntactic predicates in Xtext. As promised, I’ll provide a few more examples, most of which come from the realm of GPL-like languages.

But first, a little summary is in order. As stated in the previous blog, a syntactic predicate is an annotation in an Xtext grammar which indicates to the ANTLR parser generator how a (potential) ambiguity should be resolved: by picking the (first) alternative which is decorated with ‘=>’. The annotation can be applied to:

  • a(n individual) keyword (such as ‘else’),
  • a rule call (unassigned or as part of an assignment) and
  • a grouped parse expression, i.e. a parse expression between parentheses.

One thing to keep in mind -not only for syntactic predicates but in general- is that an Xtext grammar has at least three and often four responsibilities:

  1. defining the lexing behavior through definition and inclusion of terminals;
  2. defining the parsing behavior through parser rules which determine how tokens are matched and consumed;
  3. defining how the model is populated;
  4. (when not using an existing Ecore model) defining the meta model.

Syntactic predicates influence the second of these but not the others. It is, after all, a syntactic predicate, not a semantic one – which Xtext doesn’t have in any case. Just as without using syntactic predicates, parsing behavior is not influenced by how the model is populated: instead, it is governed solely by the types of the tokens it receives from the lexer. This is easily forgotten when you’re trying to write grammars with cross-references like this:

SomeParserRule: Alternative1 | Alternative2;
Alternative1: ref1=[ReferencedType1|ID];
Alternative2: ref2=[ReferencedType2|ID];

In this case, the parser will always consume the ID token as part of Alternative1 even if its value is the (qualified) name of something of ReferencedType2. In fact, ANTLR will issue a warning about alternative 2 being unreachable so it is disabled. For a workaround for this problem, see this older blog: it uses a slightly different use case as motivation but the details are the same. The only thing a syntactic predicate can do here is to explicitly favor one alternative over the other.

Some examples from Xbase

The Xtend and the Xbase languages that Xtext ships with both use plenty of syntactic predicates to avoid ambiguities in their grammars and to avoid having to use backtracking altogether. This already indicates that syntactic predicates are a necessary tool, especially when creating GPL-like or otherwise quite expressive DSLs. Note again that syntactic predicates are typically found near/inside optional parts of grammar rules since optionality automatically implies an alternative parsing route.

A good example can be found in the Xbase grammar in the form of the XReturnExpression rule: see GitHub. It uses a syntactic predicate on an assignment to force the optional XExpression following the ‘return’ keyword to be parsed as part of the XReturnExpression rather than being an XExpression all on its own – which would have totally different semantics, but could be a viable interpretation considering Xtend doesn’t require separating/ending semi-colons.

The Xbase grammar also shows that syntactic predicates are an effective way to disambiguate the use of pairs of parentheses for denoting a list of arguments to a function call from that for grouping inside an expression: once again, see GitHub – here, the syntactic predicate applies to a grouped parse expression, i.e. everything between the parentheses pair starting just before the ‘=>’.

Unforeseen consequences

Even if you don’t (have to) use syntactic predicates yourself, it’s important to know of their existence. As an example, the other day I was prototyping a DSL which used the JvmTypeReference type rule from Xbase followed by an angled bracket pair (‘<’, ‘>’) which held ID tokens functioning as cross-references. I was momentarily surprised to see parse errors arise in my example along the lines of “Couldn't resolve reference to JvmType 'administrator'.” The stuff between the angled brackets was being interpreted as a generic type parameter!

It turns out that the JvmTypeReference parser rule uses a syntactic predicate on an angled bracket pair surrounding generic type parameters. This explains both the behavior and the lack of warnings by ANTLR about grammar ambiguities. You’d probably have a hard time figuring out this behavior before finding an innocuous ‘=>’ here. In the end, I changed “my” angled brackets to square brackets to resolve this. This shows that syntactic predicates, just like backtracking, can be a double-edged sword: they can solve some of your problems but you have to really know how they work to be able to understand what’s going on.

I hope that this was useful for you: please let me know whether it was! I’m not planning on a third installment but you never know: a particularly enticing use case might just do the trick.