Home > Uncategorized > A pure-Java arithmetics parser

A pure-Java arithmetics parser

Well, it’s been over a month since the last post…I’ll try and post more often in the near future.

Anyway, one of the things I did in the past month was to write a parser for arithmetic expressions by hand. This was mainly inspired by Sven Efftinge’s blog and the workshop I gave which showed that explaining such a parser is actually quite hard. Sven’s movie which actually showed the call stack is rather more enlightening than any verbal account of the matter, so I started thinking about how to do the same thing but then interactively (and without the need to do a voice over which I don’t like to do since I hate hearing my own voice). In the end, I figured that Eclipse’s debugger already provides that functionality so a parser (and lexer) hand-written in Java should already do the trick.

The parser can be found alongside the Xtext workshop material I published earlier on Google Code. Remarks:

  • It is a regular Eclipse project which doesn’t use anything beyond a standard Java JRE (>=5) and JUnit4.
  • The project contains three parser classes: ParserWithLeftAssociativity, ParserWithSomeRightAssociativity and ParserWithSomeNonAssociativity -the correspondence (to Sven’s blog) should be obvious 😉
  • The parsers share a common lexer implementation and the two latter parsers derive from the first one both for convenience and for sake of clarity -again, the correspondence should be obvious.
  • I matched the parser implementation with the structure of the Xtext grammar definition and also (to a limited extent) with the structure of a generated Antlr-based parser. Comments refer to bits of the grammar which are being matched at that point in the code.
  • Arithmetic expression can contain identifiers (matching /[a-zA-Z]+/) which are essentially defined on-the-fly. I added this to make the lexer slightly more interesting.
  • I didn’t try and match the structure of the lexer to that of an Xtext-generated one. Instead, it’s rather free-form and doesn’t follow any particular lexer pattern I know (e.g., from the “Dragon book”): I’ve gone for a “whatever works and communicates” approach -I hope it communicates to you as well 😉
  • Error detection, reporting and recovery is basic…at most. (The parser behaves slightly better than the lexer.)
  • Unit tests are not too good (or actually not asserting anything for the lexer…).

To see the parser and its rule call stack, simply enable a breakpoint at Parser#ruleExpression and run a unit test in debug mode. Let me you know if you find bugs, have possible enhancements or think this is useful (or not!).

Categories: Uncategorized
  1. August 23, 2013 at 8:58 am

    If you want to create DSLs in pure java, you can give this tool a try: https://github.com/pfmiles/dropincc.java , There’s already a calculator example in the quick start. See the wiki pages for more info.

    • August 23, 2013 at 9:13 am

      Thanks for the link, but creating full-blown DSLs in Java is probably the last I’d like to do 😉 In fact, I’ve pretty much moved away from textual DSLs altogether, in favor of projectional DSLs; see: http://www.mas-wb.com/

      The whole reason for creating this parser (which I should’ve stated more clearly in the blog, obviously) is to have something that’s (1) working, (2) self-contained and (3) debuggable. To be able to debug an Xtext-based expressions parser requires quite a lot of knowledge of Xtext which prohibits understanding the expression parsing itself.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: