Friday 25 January 2013 — This is close to 12 years old. Be careful.
A popular pastime among programmers is to make fun of programming languages, or at least the ones you choose not to use. For example, Gary Bernhardt’s 5-minute talk WAT is all about unexpected behavior, mostly in Javascript.
Today brought another example of surprising Javascript behavior:
> ['10', '10', '10', '10', '10'].map(parseInt)
[ 10, NaN, 2, 3, 4 ]
I looked at this and thought, like most others, “WAT??” I wanted to understand how Javascript produced this result, so I read up on Javascript’s map() function. Once I read the docs, it was clear what was going on.
In most programming languages, the map function accepts a function of one argument, invokes the function for all of the values in an array, and produces the array of results. Javascript doesn’t work quite that way.
In Javascript, the map function accepts a function of three arguments. For each element in the array, the function is passed the element, the index of the element, and the entire array. So this map function makes these function calls:
parseInt('10', 0, ['10', '10', '10', '10', '10'])
parseInt('10', 1, ['10', '10', '10', '10', '10'])
parseInt('10', 2, ['10', '10', '10', '10', '10'])
parseInt('10', 3, ['10', '10', '10', '10', '10'])
parseInt('10', 4, ['10', '10', '10', '10', '10'])
The second argument to parseInt is the base to use when converting the string to an integer. A value of 0 means, “do the right thing,” so the first result is 10. A base of 1 makes no sense, so the second result is NaN. And bases 2, 3, and 4 produce 2, 3, and 4, because the string '10' read in base N is just N. Javascript silently ignores extra arguments, so the array passed as the third argument has no effect.
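To see this for yourself, here is a minimal sketch that records what map() hands to its callback and then spells out the parseInt calls by hand. The variable names (input, seen, surprising, byHand) are mine, chosen for illustration.

```javascript
const input = ['10', '10', '10', '10', '10'];

// Record the arguments map() passes to its callback:
// the element, its index, and the whole array (only its length is kept here).
const seen = [];
input.map((element, index, array) => {
  seen.push([element, index, array.length]);
  return element;
});

// Passing parseInt directly means the index arrives as the radix.
const surprising = input.map(parseInt);

// The same calls written out explicitly.
const byHand = input.map((s, i) => parseInt(s, i));

console.log(seen[1]);     // [ '10', 1, 5 ]
console.log(surprising);  // [ 10, NaN, 2, 3, 4 ]
console.log(byHand);      // [ 10, NaN, 2, 3, 4 ]
```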
So is Javascript’s map wrong? It behaves differently than the map found in lots of other languages like Python, Ruby, Lisp, Perl, Haskell, and so on. But it isn’t wrong.
When you work in more than one language, it’s frustrating to deal with their differences. New Python learners chafe at the fact that Python names work differently than C variables. They want to know if function arguments are call by value or call by reference (neither). I saw a person on IRC once who was upset that Python lists were called lists instead of arrays.
Languages are different; that’s why we have more than one of them. Language designers have to strike a balance between familiarity and innovation. We’d be pretty surprised by a language that used something other than “+” for adding numbers together, for example. Eventually though, the new language will diverge from the old, or what was the point?
Javascript’s map function feels a little clunky: it’s focused on integer-based iteration rather than on pure functional construction. But it can do things the other maps can’t easily, like create an array of differences between elements, or a running total, and so on. Other languages have other solutions to those problems.
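As a rough sketch of what the extra arguments make possible, here are the two examples just mentioned. The sample data and names (readings, differences, runningTotal) are made up for illustration, and this is one way to write them, not the only one.

```javascript
const readings = [3, 7, 12, 20];

// Differences between neighbours: use the index and the array to look
// back one slot, then drop the placeholder for the first element.
const differences = readings
  .map((value, i, arr) => (i === 0 ? null : value - arr[i - 1]))
  .slice(1);

// Running total: sum everything up to and including the current index.
// (Quadratic, but fine for showing what the extra arguments allow.)
const runningTotal = readings.map((_, i, arr) =>
  arr.slice(0, i + 1).reduce((sum, x) => sum + x, 0)
);

console.log(differences);   // [ 4, 5, 8 ]
console.log(runningTotal);  // [ 3, 10, 22, 42 ]
```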
This isn’t to say that all languages are equal. There are better and worse, of course. But too often I hear people ranting about a language being stupid for some decision, without bothering to find out why it was done that way, and what benefit they might get from it.
To which, I say: WAT!?
Comments
Dirkjan: Is your point that a 'map' like the one described here should be called something different, in the spirit of Common Lisp's family of map functions? (mapcar, mapc, maplist, mapl, mapcan, and mapcon.)
To be honest, when I think of generic 'map', I think of Scheme's version which takes an n-ary function followed by n lists, each of length M. The result is a list of length M, composed by calling the function with the 'm'th element of each list. It's always annoyed me that Python's 'map' doesn't generalise like this. (But it's easy to write one yourself which does.)
e.g.
(map add (list 1 2 3) (list 4 5 6))
-> (list 5 7 9)
1) parseInt() takes a radix as an optional second argument;
2) calling a two-argument function with three arguments doesn't raise an error;
3) map() isn't a pure function with respect to the elements of its array argument. (It's obviously pure with respect to the array taken as a whole.)
1 is fine. 2 is weird.
3, however, is stunning to me. Most of my experience is with imperative languages, not functional ones. I believe map() is a relatively recent addition to JS, so I tend to see it as part of a modern trend of bringing functional features into general/imperative/OO languages -- you can see this process pretty clearly in Python's adoption of FP features. With that in mind, map() isn't part of a Javascript tradition, and shouldn't get a pass for being surprising because "languages are different". Javascript's map() is a loanword. In natural language, loanwords usually drift in definition, and that causes ambiguity problems. Here, it's a disaster.
@Jonathan: btw, no need to be annoyed at Python's map, it does take N iterables and a function of N arguments!
like this?
>>> from operator import add
>>> map(add, [1,2,3], [4,5,6])
[5, 7, 9]
var set = ['10', '10', '10'],
    result = [];
for (var i = 0; i < set.length; i++) result.push(parseInt(set[i], 10));
That's not substantially longer than a more realistic interpretation of your example, and the ECMAScript designers (and myself) would argue that it's much clearer than using `map`. If you think `map` is clearer, odds are this isn't even close to your least favorite thing about javascript.
I always thought I was a big python fan... then I tried Node.js and it was like dawn breaking.
and ignoring arguments without a compiler/interactive console warning. why don't i just run around my tiny apartment with a knife in my hand screaming and blind-folded?
we've had 40+ years of type theory and language design history, why should I use a language that doesn't learn from that extensive knowledge base?
Javascript would be a better language if it adopted a similar principle. Every language would, actually. Innovative features of a language only count as an improvement if they make the programmer's life better. If they suck and cause confounding bugs that require going deep, straight past the API of the language and right down to the implementation level to understand what is going wrong, then that language feature is basically a downgrade to the language as a whole and it shouldn't exist.
@dbg: you say, "familiar things should behave in familiar ways," but familiar to who? Should Python constructs behave as a C programmer would expect them to, or the way a Lisp programmer would? What about an APL programmer or an Erlang or Haskell or Bash programmer?
Python lists are kind of like C arrays, and kind of like Lisp lists, but also different from both of them. Should they be called lists or arrays? What should they be implemented like? Is the behavior most important, or the implementation? Nothing is obvious here, which is why it's called programming language "design," and not just programming language "implementation."
You make bold statements as if they are clear directives, but they are not. They are another set of conflicting goals that the designer has to balance.
I think the confusion comes from using a function you may not be familiar with (parseInt) as much as the issue with how map() works.
Javascript obviously doesn't have a compiler to give you such a warning, so it would have to be something in an interactive console -- and Javascript didn't have an interactive console for a nontrivial chunk of its meaningful lifetime.
Changing that behavior now would probably be very weird. Given that's how Javascript works, code has grown up around that; I have used the fact that extra arguments are ignored many times; you can still access the arguments via the 'arguments' variable, and doing so lets you take unnamed positional arguments on subclasses that can be passed to superclasses.
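A small sketch of the behavior described here, for readers who haven't seen it. The function and constructor names (add, countArgs, Base, Derived) are hypothetical, and the constructor-forwarding pattern is only one way the 'arguments' variable gets used.

```javascript
// A function declared with two parameters.
function add(a, b) {
  return a + b;
}

// Extra arguments are silently ignored: no error, no warning.
const sum = add(1, 2, 'ignored', 'also ignored');

// The extras are still reachable through the `arguments` object.
function countArgs(a, b) {
  return arguments.length;
}
const count = countArgs(1, 2, 3, 4);

// Pre-ES6 style: forward whatever positional arguments arrive
// to a "superclass" constructor without naming them.
function Base(name, size) {
  this.name = name;
  this.size = size;
}
function Derived() {
  Base.apply(this, arguments);
  this.derived = true;
}
const widget = new Derived('widget', 3);

console.log(sum);                      // 3
console.log(count);                    // 4
console.log(widget.name, widget.size); // widget 3
```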
I'm not saying any of this is right, but as someone else pointed out: This is Javascript. If you want to use it, you need to get used to it; this 'quirk' is hell most of the time, occasionally useful, but regardless, it is part of the language and tools around it today.
Until you ignored your own logic and linked to that discredited PHP "fractal of bad design" post. It has been destroyed, repeatedly (links below) but, more importantly here, the dismissal of PHP as a "worse" language ignores the excellent advice in this post.
Worse when?
Instead of using this comment as part of a language flame war (those links are below, for those who care), I would like to apply this post's excellent critique of ignorant dismissals of languages to this post's ignorant* dismissal of PHP.
Why might someone choose PHP? Let's explore.
Imagine you are about to start building a web application which you hope will be deployed on thousands of servers, by thousands of users. You would do well to make the setup as easy as possible. Perhaps the single most important factor for a large segment of your potential customers will be ease of installation.
With PHP, installation is just SFTP-ing your files. That is it. No server configuration, no anything. An update? Change the file, and FTP it back to the server. No restarting the server. No rebuilding anything. No nothing.
This is one of the killer features in PHP, the language. Ease of use for server "administrators." For the team that creates a (hopefully!) widely deployed web app, these people are referred to as users or customers. If they can install your software easily, it is much more likely they will.
This feature of PHP plays no small part in the success of widely deployed web apps, such as WordPress, MediaWiki, Joomla, Drupal, etc. In fact, I would go so far as to say that if your goal is a widely deployed web app, then PHP is the best language out there.
PHP has other great features absent in other languages, described in the links below. Worth learning about.
To quote an insightful author:
"Too often I hear people ranting about a language being stupid for some decision, without bothering to find out why it was done that way, and what benefit they might get from it."
Language war stuff
I love Python. I would just never choose it if my goal was a widely deployed web app. FWIW, probably because I don't know it well enough, I actually prefer PHP to Ruby.
I program in many languages, mostly PHP, C++, JavaScript, and, recently, Clojure. I actually (mostly) like all of them. No matter which I am using, I always hit things where I think "this would be so much easier with [fill in the blank]" but they each are best at something. PHP is best for building widely deployed web apps.
* The links, for those who would like to learn about the good parts of PHP, why the fractal article is crap, and the reason PHP is so often chosen (such as for the software that handles this very comment):
http://forums.devshed.com/php-development-5/php-is-a-fractal-of-bad-design-hardly-929746.html
http://blog.ircmaxell.com/2012/04/php-sucks-but-i-like-it.html
http://fabien.potencier.org/article/64/php-is-much-better-than-you-think
I point to PHP as a worse language because it is internally inconsistent, and has features that make it difficult to write robust software.
I argue in this post for learning a language for itself rather than assuming it behaves like $SOME_OTHER_LANGUAGE. Criticizing Javascript because it behaves in a surprising way when you don't read the docs is silly. That doesn't mean anything goes. PHP has bad features.
PHP seems not designed so much as simply accreted. There is little consistency in the standard library, there are features in the language that fly in the face of everything people have learned about language design, and so on.
Some things are bad, just not all the things that confused doc-skimmers.
Your critique of PHP is true. PHP doesn't feel designed. I tend to use the word 'evolved' instead of 'accreted' but I hear you. It is not designed well.
But.
PHP does evolve in a way that designed languages can not. Although quietly, PHP has been fixing issues and adding features faster than any other popular language out there. Not a new major version, but lots and lots of great stuff in the minor versions. In the past 6 years 5.2, 5.3, and 5.4 were all huge. 5.5 is coming soon, with lots more goodies and fixes. I believe that rapid evolution is good for a web-centric language. Rapid evolution is hell for consistency and produces random gotchas, which in turn is hell on my brain*, but rapid evolution fits the fast pace of the web.
To close the circle, I think your original post is brilliant. Languages are different and that is good. People should learn them instead of calling them stupid. I just kinda feel that the critique of PHP falls (somewhat) into that trap.
The things you want, and are used to in other languages, PHP lacks. Lack of consistency. Not well designed. But these are a language choice and in return we get rapid evolution.
My personal stance: languages don't matter nearly as much as every other aspect of a project. The developers, their relationship, the strategy, and the enthusiasm all play a larger role in a project's success or failure than the language choice. And, of course, the thing that matters most is the idea: is the project actually useful?
Honestly, although PHP is probably the best choice for a widely deployed web app (see my other comment above), any popular open source MVC framework in any modern language is probably the right choice for over 90% of new web projects, if you control the infrastructure. The language simply isn't as important as we make it out to be, but people do grow irrationally attached to, and defensive of, their languages. Which is fine. Perhaps that is why I am spending my Sunday morning defending PHP to strangers.
Thanks for a thought-provoking weblog. I got to your blog through a twitter or reddit link, but I just bookmarked it. For better or worse, I will be back. ;)
PS I knew the comment system was built in PHP because of the excellent Wappalyzer firefox plugin, which I can't recommend enough. If you do web work, you will love it.
http://wappalyzer.com/
* The hell on my brain due to language inconsistencies and random gotchas is greatly minimized by PHP's excellent web-based documentation. Best I have ever seen, down to the invaluable reddit style voting and sorting of the comments.
I do not program in PHP, however, and most of my knowledge of the language comes from http://phpmanualmasterpieces.tumblr.com
Well done, sir.
Thank you for your reply. I don't think you are really spending a lot of time thinking about the meaning of the principle of least astonishment. It's a cultural norm, not a technical specification. You point out that different languages have different cultural expectations, but this is only true to a certain extent. We see a repeated pattern of languages being designed to conform to the expectations of an existing language culture. That's the reason why Java (and Javascript for that matter) has curly braces and semicolons.
You give an example of Python lists perhaps being an astonishing construct. This is completely false. Python lists are lists. They are called lists. That is the name of them because they ARE lists. The expected behavior of a list is that it supports operations like insert, append, prepend, extend, etc. And this is what Python lists support. The new part is that there is syntactic sugar that lets you use the list (any Sequence type in Python actually) with the [] slicing syntax. That new part ALSO conforms well to the principle of least astonishment though, since it is designed to behave very very similarly to the array access operator from C. Again, only the new parts do something new.
Javascript breaks cultural expectations all over the place, which is why there are criticisms of it along the lines of the WAT video. Every language breaks SOME expectations, but most at least try to stay non-astonishing when they can. JavaScript doesn't even try.
That doesn't make it a bad language. It makes it a bit harder to learn and harder to debug, but those are not the only important things. There's a lot of good about Javascript, but its frequent violation of cultural norms and expectations is not one of them.
I'm not arguing that Javascript is on the good end of the spectrum, I just think that things are far more subjective than a lot of engineers would like to believe. It's easy to get into a mindset of "everyone knows C", or whichever language you feel is most important, but it doesn't mean everyone does know C.
> ['10','10','10','10','10'].map(Number)
[ 10, 10, 10, 10, 10 ]
At least, that's what I've always used.
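For comparison, a quick sketch of two ways to sidestep the problem: Number, which only looks at its first argument, and a small wrapper that gives parseInt an explicit radix. The two are not interchangeable on every input, as the last lines show. The variable names are illustrative only.

```javascript
const strings = ['10', '10', '10', '10', '10'];

// Number ignores the index and array that map() also passes in.
const viaNumber = strings.map(Number);

// Wrapping parseInt keeps the index from being treated as the radix.
const viaParseInt = strings.map((s) => parseInt(s, 10));

// The two differ on strings with trailing non-digit characters.
const strict = Number('10px');         // NaN
const lenient = parseInt('10px', 10);  // 10

console.log(viaNumber);    // [ 10, 10, 10, 10, 10 ]
console.log(viaParseInt);  // [ 10, 10, 10, 10, 10 ]
console.log(strict, lenient);
```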