Parsing expression grammars (PEGs) are an analytic grammar formalism. They feature the following operators: * sequencing (e1 e2) * unambiguous choice (e1/e2) * (greedy) repetition (e*) * option (e?) * one-or-more (e+) * positive lookahead (&e) * negative lookahead (!e) They are: * fast to parse (O(n) in time and space, where n is the size of the input) * unambiguous by construction * closed under composition, which allows modular grammars * more expressive than context-free grammars (they can parse L="a"^n"b"^n"c"^n) However, they come with their share of problems: * PEGs can't directly handle left-recursion (not much of a problem in practice) * PEGs suffer from so called "prefix capturing": "a*a" will never match any input ** (this is what you get for defining away ambiguity) * PEGs are bad for specifying languages: just guess what language "S <- aSa / aa" parses ** for the lazy: A string of "a"s with a length which is a power of two! * PEGs can't generate strings, only parse them (due to lookahead operators) ** ''it seems to me that one could 'enforce' discovery of whatever one looked ahead to find'' ** Generating "(&A)B" would involve finding the intersection of A and B. This is an undecidable problem: If it would be possible to find the intersection of two PEGs we would not have issues detecting prefix capture. Prefix-capture is present in "A/B" iff the language generated by "(&A)B" is not empty. See [1] for more information. [1] http://charybde.homeunix.org/~schmitz/pub/modular.pdf ---- CategoryCompilers