> Lookbehinds and lookaheads aren't rocket science.

Lookbehinds and lookaheads (especially negative lookbehinds) are rocket science.

What is "rocket science"? "Rocket science" is the feeling you get in math class where the instructor explains a proof to you in the clearest possible terms and you just don't get it. You have to listen to the explanation multiple times, preferably in a few different ways, and then you have to sleep on it, and then you get it, maybe.

But "rocket science" isn't just hard to understand. It's a hard problem where the consequences for failure are catastrophic. When you fail at rocket science, a multi-million dollar rocket explodes.

Anyone who's ever tried to teach lookbehinds to a newbie has seen it: you explain how lookbehinds work, and then ask the newbie to create a regex with negative lookbehind, to demonstrate mastery. I've done it a few times, and they never get it right, ever.

At best, they flub the syntax, but even once they get over that, they usually write the worst possible regex: a regex that works correctly on desired inputs but does the wrong thing on the input the regex is designed to reject.

This is a notorious problem with writing regexes, but it's way worse for negative lookbehind, because it's asserting that something isn't there, rather than querying for something that is there.
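To make that failure mode concrete, here's a minimal Python sketch. The task (find numbers that aren't dollar amounts), the patterns, and the sample strings are all made up for illustration; they aren't from TFA.

```python
import re

# Hypothetical task: find numbers that are NOT prices (not preceded by "$").
naive = re.compile(r"(?<!\$)\d+")

# Works on the input we want to accept.
print(naive.findall("order 100 units"))   # ['100']

# Fails on the input we meant to reject: the lookbehind only blocks a match
# starting at "1", so the engine moves on and matches starting at the first "0".
print(naive.findall("costs $100"))        # ['00']

# One possible fix: also refuse to start in the middle of a digit run.
fixed = re.compile(r"(?<![$\d])\d+")
print(fixed.findall("order 100 units"))   # ['100']
print(fixed.findall("costs $100"))        # []
```

The naive version looks fine if you only ever try it on the inputs you want to match, which is exactly why it slips through.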

When I see a regex with negative lookbehind during code review, I ask for unit tests, not just comments. Reliably, regexes get even more complex when unit tests are added, because it's just so damn hard to write a correct regex with negative lookbehind.
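Roughly the kind of test I'd ask for, sketched with Python's `unittest`. The pattern and the cases are hypothetical (numbers not preceded by "$"); the point is that the reject cases get tested explicitly, not just the accept cases.

```python
import re
import unittest

# Hypothetical pattern under review: numbers that are not dollar amounts.
# The \d inside the lookbehind stops a match from starting mid-number.
NOT_A_PRICE = re.compile(r"(?<![$\d])\d+")


class NegativeLookbehindTest(unittest.TestCase):
    def test_accepts_plain_numbers(self):
        self.assertEqual(NOT_A_PRICE.findall("order 100 units"), ["100"])

    def test_accepts_number_at_start_of_string(self):
        # Nothing precedes the match, so the negative lookbehind succeeds.
        self.assertEqual(NOT_A_PRICE.findall("42 is the answer"), ["42"])

    def test_rejects_prices_entirely(self):
        # The important case: no partial match such as "00".
        self.assertEqual(NOT_A_PRICE.findall("costs $100"), [])

    def test_mixed_input(self):
        self.assertEqual(NOT_A_PRICE.findall("pay $100 for 3 items"), ["3"])
```

Run it with `python -m unittest`. Drop the `\d` from the lookbehind and `test_rejects_prices_entirely` fails, which is the sort of thing a comment alone won't catch.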

I've never used the "trick" from TFA before, but it already sounds way easier to use than negative lookbehinds, and I'm curious to try it.



I agree on unit tests for non-trivial regexes as a general rule, but respectfully disagree on lookaheads and lookbehinds.

Things like greedy vs. non-greedy matching, matching newlines or not, handling Unicode correctly, inserting a capturing group when you actually needed a non-capturing group, making sure your regex works if it matches the start or end of a string, escaping characters -- those can be tricky.
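A few of those in Python, as a quick sketch. The sample strings are invented; the behavior shown is standard for the `re` module.

```python
import re

# Greedy vs. non-greedy: ".+" grabs as much as it can, ".+?" as little.
greedy = re.findall(r"<.+>", "<a><b>")
lazy = re.findall(r"<.+?>", "<a><b>")
print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']

# Newlines: "." does not match "\n" unless re.DOTALL is set.
no_flag = re.findall(r"a.b", "a\nb")
dotall = re.findall(r"a.b", "a\nb", flags=re.DOTALL)
print(no_flag)  # []
print(dotall)   # ['a\nb']

# Capturing vs. non-capturing: with a capturing group, findall returns the
# group's contents instead of the whole match.
capturing = re.findall(r"(ab)+", "abab")
non_capturing = re.findall(r"(?:ab)+", "abab")
print(capturing)      # ['ab']
print(non_capturing)  # ['abab']
```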

On the other hand, lookaheads and lookbehinds are conceptually extremely straightforward; you just need a cheatsheet to remember the syntax.
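For reference, the whole cheatsheet is four forms. A small Python sketch, with made-up "foo"/"bar" strings, showing where each one matches:

```python
import re

# (?=...)  positive lookahead: "foo" only if followed by "bar"
ahead = re.search(r"foo(?=bar)", "foobaz foobar").start()
print(ahead)       # 7

# (?!...)  negative lookahead: "foo" only if NOT followed by "bar"
not_ahead = re.search(r"foo(?!bar)", "foobar foobaz").start()
print(not_ahead)   # 7

# (?<=...) positive lookbehind: "bar" only if preceded by "foo"
behind = re.search(r"(?<=foo)bar", "bazbar foobar").start()
print(behind)      # 10

# (?<!...) negative lookbehind: "bar" only if NOT preceded by "foo"
not_behind = re.search(r"(?<!foo)bar", "foobar bazbar").start()
print(not_behind)  # 10
```

None of the four consume the text they look at, so the match itself is still just "foo" or "bar".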


Ha. Of all the things I learned at university, rocket science was the easiest to get. Quantum mechanics on the other hand sucked.



