Regexps do work for tokenization. The entire essay boils down to the fact that you can always postprocess the matches, which in this case means tossing the unwanted tokens out.
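To make that concrete, here is a rough sketch of what "postprocess and toss" could look like in Python. The token classes (NUMBER, IDENT, OP, SKIP) and the `tokenize` helper are made up for illustration and are not taken from the essay.

```python
import re

# Hypothetical token classes, chosen only for this illustration.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<OP>[-+*/()=])
  | (?P<SKIP>\s+|\#[^\n]*)     # whitespace and line comments
""", re.VERBOSE)

def tokenize(text):
    """Match tokens left to right, then drop the ones nobody wants."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        # The postprocessing step: SKIP matches are simply not kept.
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("x = (1 + 2) # sum"))
```

The regex itself never has to "ignore" anything; it just labels every span, and the filter afterwards decides what survives.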


Yes, tokenization is regular.

Parsing the tokenization result of a Dyck language still requires a context-free grammar.

It's not a badge of honor or a great trick to try that with regular expressions. It is using the wrong tool for the job.
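For illustration, a minimal sketch of the part that a regular language cannot do: checking that an already-tokenized bracket sequence is balanced. The explicit stack is the unbounded memory a finite automaton lacks. The `PAIRS` table and `is_balanced` name are just made up for this example, and it assumes the tokens are single bracket characters.

```python
# Maps each closing bracket to the opener it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}

def is_balanced(tokens):
    """Return True if the bracket tokens form a well-nested (Dyck) word."""
    stack = []
    for tok in tokens:
        if tok in PAIRS.values():
            # Opening bracket: remember it until its partner shows up.
            stack.append(tok)
        elif tok in PAIRS:
            # Closing bracket: must match the most recent unclosed opener.
            if not stack or stack.pop() != PAIRS[tok]:
                return False
    # Anything left on the stack was opened but never closed.
    return not stack

print(is_balanced(list("([]{})")))
print(is_balanced(list("([)]")))
```

The nesting depth is unbounded, so no fixed number of states can track it; that is the job the stack (or the recursion in a recursive-descent parser) is doing here.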



