C# GOLD Parser Engine
Devin Cook created the GOLD Parser Builder, a tool for building parsers for programming languages.
A grammar is described in BNF notation. The Builder compiles it into a binary file that contains the DFA and LALR parsing tables.
A parser engine written in any language can then load these binary tables and parse input with them.
I have contributed to the GOLD Parser project by writing a GOLD Parser engine in C# and a VBScript grammar.
Key features of the C# GOLD Parser engine 2.1
The parser is designed as a "pull" parser: the client program pulls data out of the parser.
The parser keeps only the minimum information required for parsing, so it can parse large files without allocating a lot of memory.
The parser reads its input from a TextReader, so it works with both StringReader and StreamReader.
The parser allocates token strings lazily: if the client never needs a token's text, no string is allocated for it.
A dedicated Grammar class can be initialized once from a CGT (compiled grammar table) file and reused for multiple parsing operations.
All Grammar properties and tables are immutable.
The parser uses special transition vectors to speed up DFA and LALR processing. They are optimized in particular for the case where the ASCII character set is used: the char value serves directly as an array index.
A new Parser object has to be created for each string or stream to be parsed.
The parser returns special parse messages so that comments can be collected.
A callback function can be supplied to collect source lines.
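Several of the features above (pull-style reading from a TextReader, lazy token text, and an array lookup indexed by char value for ASCII input) can be illustrated with a small stand-alone sketch. It is not the engine's code or API: PullScanner, TokenKind, ScannerDemo and every member name below are hypothetical, and a simple character-class table stands in for the engine's real DFA transition vectors.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical names throughout; this is NOT the GOLD Parser engine API.
enum TokenKind { Number, Word, Other, End }

sealed class PullScanner
{
    // Character classes for ASCII, indexed directly by char value.
    // 0 = other, 1 = digit, 2 = letter, 3 = whitespace.
    // Built once and shared (read-only) by all scanner instances.
    private static readonly byte[] AsciiClass = BuildAsciiClasses();

    private readonly TextReader reader;
    private readonly StringBuilder buffer = new StringBuilder();
    private string text; // created only when TokenText is read

    public PullScanner(TextReader reader) { this.reader = reader; }

    public TokenKind Kind { get; private set; }

    // Lazy token text: no string is allocated unless the caller asks for it.
    public string TokenText
    {
        get
        {
            if (text == null) text = buffer.ToString();
            return text;
        }
    }

    private static byte[] BuildAsciiClasses()
    {
        byte[] table = new byte[128];
        for (char c = '0'; c <= '9'; c++) table[c] = 1;
        for (char c = 'a'; c <= 'z'; c++) table[c] = 2;
        for (char c = 'A'; c <= 'Z'; c++) table[c] = 2;
        table[' '] = table['\t'] = table['\r'] = table['\n'] = 3;
        return table;
    }

    private static int Classify(int ch)
    {
        if (ch < 128) return AsciiClass[ch]; // fast path: plain array lookup
        char c = (char)ch;                   // slower path for non-ASCII input
        if (char.IsLetter(c)) return 2;
        if (char.IsDigit(c)) return 1;
        if (char.IsWhiteSpace(c)) return 3;
        return 0;
    }

    // Pull API: the caller decides when the next token is read.
    public TokenKind Read()
    {
        buffer.Length = 0;
        text = null;
        int ch;
        // Skip whitespace.
        while ((ch = reader.Peek()) >= 0 && Classify(ch) == 3) reader.Read();
        if (ch < 0) return Kind = TokenKind.End;

        int cls = Classify(ch);
        if (cls == 0)
        {
            // Any other character becomes a one-character token.
            buffer.Append((char)reader.Read());
            return Kind = TokenKind.Other;
        }
        // Consume a run of characters of the same class.
        while ((ch = reader.Peek()) >= 0 && Classify(ch) == cls)
            buffer.Append((char)reader.Read());
        return Kind = (cls == 1) ? TokenKind.Number : TokenKind.Word;
    }
}

static class ScannerDemo
{
    // Counts tokens of one kind without ever materializing a token string.
    public static int CountKind(TextReader input, TokenKind kind)
    {
        PullScanner scanner = new PullScanner(input); // one scanner per input
        int count = 0;
        while (scanner.Read() != TokenKind.End)
            if (scanner.Kind == kind) count++;
        return count;
    }
}
```

For example, ScannerDemo.CountKind(new StringReader("abc 123 + x"), TokenKind.Word) returns 2 and never reads TokenText, so no token string is created. The shared read-only table plus one scanner object per input loosely mirrors the split between a reusable Grammar and a per-input Parser described above.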
What's new in C# GOLD Parser engine 2.1 vs 2.0
Fixed m_lrStack overflow error.
Fixed DFA transition vector logic for small sets with Unicode characters.
Documentation
Online documentation for the engine, generated with NDoc 1.3.1.
Download
GOLD Parser engine version 2.1 .NET assembly:
GoldParserEngine_2.1.zip (built for .NET 1.1)
Autogenerated documentation for GoldParser.dll
The engine source code archive, GoldParserEngineSource_2.1.zip, contains:
Full source code for the engine.
Source code for SimpleInterpreter sample.
Source code for Tree Builder sample.
Generated documentation files in CHM and HTML formats.
Grammar downloads
VBScript grammar.
Triangle grammar from the book Programming Language Processors in Java.
CSV (comma-separated values) grammar.