Regexp for *tokenization* does work. This entire essay boils down to the fact th... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		lifthrasiir on July 8, 2021 \| parent \| context \| favorite \| on: The Greatest Regex Trick Ever (2014) Regexp for tokenization does work. This entire essay boils down to the fact that you can always postprocess matches and in this case that corresponds to tossing unwanted tokens out.

beders on July 9, 2021 [–]

Yes, tokenization is regular.

Parsing the tokenization result of a Dyck language still requires as context-free grammar.

It's not a badge of honor or a great trick to try that with regular expressions. It is using the wrong tool for the job.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact