Ned Batchelder: Blog, April 2011
gets transformed into this "Perl":
I assume the reason //-comments that share a line with code are skipped is to avoid clobbering strings with // in them, though with multi-line strings, even that is not enough to protect them.
Here messages 1 and 5 are found, and 3 and 4 are not. How come? Because Perl's y operator consumes two strings delimited by the next character, in this case a semicolon, so lines 3 and 4 are considered literals rather than code.
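To make that concrete, here is a rough Python sketch of how a Perl-style scanner would swallow code after a variable named y. The JavaScript input is a hypothetical stand-in that follows the same pattern of found and missed messages, not the original sample, and `consume_transliteration` is an illustrative helper of my own, not anything xgettext or Perl exposes. It ignores backslash escapes and bracketing delimiters.

```python
def consume_transliteration(source, y_index):
    """Mimic how Perl reads y<d>SEARCH<d>REPLACE<d> starting at the 'y'.

    Returns (search, replacement, rest_of_source).
    Simplified assumption: no escapes, no bracketing delimiters.
    """
    delimiter = source[y_index + 1]
    first_end = source.index(delimiter, y_index + 2)
    second_end = source.index(delimiter, first_end + 1)
    search = source[y_index + 2:first_end]
    replacement = source[first_end + 1:second_end]
    return search, replacement, source[second_end + 1:]


# Hypothetical JavaScript input (an assumption, not the post's sample).
js = (
    'gettext("message 1");\n'
    'x = y;\n'
    'gettext("message 3");\n'
    'gettext("message 4");\n'
    'gettext("message 5");\n'
)

y_index = js.index("y;")
search, replacement, rest = consume_transliteration(js, y_index)

print("treated as string 1:", repr(search))
print("treated as string 2:", repr(replacement))
print("still seen as code: ", repr(rest))
```

The two statements after the `y;` end up inside the operator's two string arguments, so only the last call is left over as code for the extractor to scan.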
But distinguishing between division and regexes is impossible to do at a purely lexical level, and can be quite subtle:
The first line has a regex of /x:3;x<5;y</g, the second has /g/i.
The ECMAScript standard says you need to parse the code: if you're at a point where a regex literal would be a valid next token, lex it as a regex, but if you're at a point where division would be valid, then lex it as division.
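A full parser is the only fully correct answer, but a common approximation is to look at the previous significant token. The sketch below is one such heuristic in Python, under simplifying assumptions: the names `REGEX_PRECEDERS`, `REGEX_KEYWORDS`, and `slash_starts_regex` are mine, the token lists are incomplete, and cases such as a slash after `)` or `}`, or after a postfix `++`, still need real parsing to get right.

```python
import re

# Punctuation after which an operand is expected, so "/" would start a regex.
# Assumption: this list is illustrative, not exhaustive.
REGEX_PRECEDERS = set("(,=:[!&|?{;+-*%<>~^")

# Keywords after which an expression (and so a regex) may follow.
REGEX_KEYWORDS = {"return", "typeof", "case", "in", "delete", "void", "throw"}


def slash_starts_regex(code_before):
    """Guess whether a '/' that follows code_before begins a regex literal."""
    stripped = code_before.rstrip()
    if not stripped:
        return True  # start of the program: an expression is expected
    last_word = re.search(r"[A-Za-z_$][\w$]*$", stripped)
    if last_word:
        # An ordinary identifier is an operand, so "/" is division;
        # after a keyword like "return" an expression is expected.
        return last_word.group() in REGEX_KEYWORDS
    # Numbers, "]" and ")" fall through to division here.
    return stripped[-1] in REGEX_PRECEDERS


print(slash_starts_regex("x = "))     # operand expected after "="
print(slash_starts_regex("total "))   # identifier, so division
print(slash_starts_regex("return "))  # keyword, so regex
```

This looks only one token back, which is exactly why it can be fooled: the standard's rule depends on what the grammar allows at that point, not on the preceding characters alone.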
The next phase is to determine whether this gets into Django or not. I've prepared it as a patch, but there was already some momentum to replace gettext with Babel, and it's looking like it might all have to wait for 1.4 in any case. As someone who's recently lost time to this bug, I would really rather get something into 1.3.1, so we'll see where that ends up.