Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> The article says that PEG handles non-deterministic grammars by becoming non-deterministic itself.

Wait, where? As noted here, PEGs are certainly deterministic, so I just double-checked the article: it doesn’t claim that PEGs are non-deterministic. In fact, it says that “PEG [is] always linear”, and (way before that), “all non-deterministic parsers can run in worse than quasi-linear time”. So the article indirectly states that PEGs are, in fact, deterministic.

PEG’s problem isn’t non-determinism; it’s PEG’s linear memory requirement in the length of the input (rather than the parse tree depth), as well as the fact that it fails to parse some nondeterministic (but unambiguous) grammars.



> PEG’s linear memory requirement in the length of the input

What kind of PEG are you talking about? At it's core, PEG is syntax sugar over recursive descent, and has the same memory requirements. Memoization, maybe?

That said, I wouldn't use PEG for anything else than LL(1). Add a second level with precedence climbing, and you're set for a pretty big class of languages.


> What kind of PEG are you talking about?

I’m talking about the Packrat algorithm which, as far as I know, is the only practically used PEG algorithm. The memory requirement comes from the fact that the algorithm handles partial match failures without having to recompute intermediate steps. By comparison, a bog-standard recursive descent parser without memoisation has exponential worst-case runtime [1].

[1] https://stackoverflow.com/a/16897246/1968


Last time I checked Packrat had nothing to do with memoization. Packrat parsing is the same as precedence climbing, very similar to the shunting yard algorithm, except it leverages the call stack to simplify the code.

And last time I checked, PEG was syntax sugar above recursive descent, just like parser combinators. I implemented¹ the damn thing, so I think I know what I'm talking about. Somewhat. Memoization is just a way to go beyond LL(1), and handle left recursive grammars, it's not not mandatory.

[1]: http://loup-vaillant.fr/projects/metacompilers/


The original Packrat parser explicitly defines it as using memoisation:

> Packrat parsing provides the simplicity, elegance, and generality of the backtracking model, but eliminates the risk of super-linear parse time, by saving all intermediate parsing results as they are computed and ensuring that no result is evaluated more than once.[1]

And just to get a potential misunderstanding out of the way: I’m not claiming that PEGs are formally distinct from recursive descent: According to my understanding (but unlike you I’ve never implemented the packrat algorithm itself … though I’ve written plenty of recursive descent parsers and some PEG-based parsers) there’s a 1:1 correspondence between nonterminals in PEG, and a function in the recursive descent parser. The difference is that one is a formal grammar for describing parsers, while the other is the actual parser implementation.

[1] https://paperpile.com/app/p/3c6d20f0-ab15-0b13-b809-1fdf4ac7...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: