Thursday 9 August 2007
Recently, I had two demonstrations of the pitfalls of weak typing.
First, my son Max was working on a simple Flash game. He asked for my help fixing it, because the character would move left, but it wouldn’t move right. His code looked (roughly) like this:
if (Key.isDown(Key.LEFT)) {
    guy._x -= "10";
}
if (Key.isDown(Key.RIGHT)) {
    guy._x += "10";
}
The problem here is that the _x attribute is an integer. When subtracting the string “10” from an integer, the weak typing coerces the string to an integer, and the subtraction moves the character left. But when moving right, the integer is added to a string, which is a valid string operation, so the integer is coerced to a string, and the two strings concatenated. Setting the _x position to a string doesn’t move the object, so the character doesn’t move right.
Apart from the usual mystifying behavior of weak typing, the bizarre thing here is how two cases which seem completely symmetric in fact have very different results. Strings have a plus operator, but not a minus operator, so the helpful weak typing chose different paths for the two cases, resulting in the strange left-but-not-right bug.
Changing the “10” constants to integer 10’s fixed the problem, of course, since it meant that all operations were the expected integer operations.
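The asymmetry is easy to reproduce outside Flash. This is a minimal sketch in plain JavaScript, assuming ActionScript applies the same coercion rules as the behavior above suggests; `guy` here is a stand-in plain object, not a real movie clip.

```javascript
// Stand-in for the movie clip: only the _x property matters here.
const guy = { _x: 100 };

// "-" has no meaning for strings, so "10" is coerced to the number 10.
guy._x -= "10";
const afterLeft = guy._x;

// "+" also means concatenation, so the number is coerced to a string instead.
guy._x += "10";
const afterRight = guy._x;

console.log(typeof afterLeft, afterLeft);   // number 90
console.log(typeof afterRight, afterRight); // string 9010
```

The subtraction leaves a number behind, but the addition leaves the string "9010", which is why only one direction worked.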
The second example was in some JavaScript code designed to speed up a slow calculation. The cache is a map from strings to lists of objects, but the calculation could return nothing, which was also important to cache, so the string '-' was inserted in its place:
var answer = this.cache[question];
if (!answer) {
    answer = long_expensive_calculation(question);
    if (!answer || (answer.length == 0)) {
        this.cache[question] = '-';
        return null;
    }
    else {
        this.cache[question] = answer;
        return answer;
    }
}
if (answer == '-') {
    return null;
}
return answer;
This code sped up the calculations, but still took much longer than it seemed it should. The cache had a really good hit rate (99%), so we only had to look at the path where the cache found the answer. But all it does is look up a value in a hash, compare the value to a string, and return the value. How can that take so long?
The answer lies in the weak typing of that equality check near the bottom. The answer from the cache is a list of objects. To compare that against a string, JavaScript converts the list to a string, then compares the strings. That string conversion was consuming all the time, and was completely unnecessary. If the answer wasn’t a string to begin with, we didn’t need to do the comparison at all.
Changing the comparison to:
if (typeof(answer) == 'string' && answer == '-') {
sped up the function by a factor of about 10.
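To make the hidden conversion visible, here is a small sketch that counts how many times list elements get stringified during each form of the comparison. The `makeItem` helper and its counting `toString` are hypothetical, added only for illustration; the real cached objects were not shown.

```javascript
// Counts how often an element is converted to a string.
let conversions = 0;

// Hypothetical cached object whose toString records each call.
function makeItem(name) {
  return {
    name: name,
    toString: function () {
      conversions += 1;
      return name;
    }
  };
}

const answer = [makeItem("a"), makeItem("b"), makeItem("c")];

// Loose comparison: the whole list is joined into a string first.
conversions = 0;
const looseResult = (answer == '-');
const looseConversions = conversions;

// Guarded comparison: typeof short-circuits, so nothing is converted.
conversions = 0;
const guardedResult = (typeof answer == 'string' && answer == '-');
const guardedConversions = conversions;

console.log(looseResult, looseConversions);     // false 3
console.log(guardedResult, guardedConversions); // false 0
```

Both comparisons give the same result, but the loose one stringifies every element of the list on every cache hit, so its cost grows with the size of the cached answer.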
BTW: this function is more complicated than it had to be. The simpler approach, which avoids the sentinel value and its string comparison, is:
var answer = this.cache[question];
if (typeof(answer) == 'undefined') {
    answer = long_expensive_calculation(question);
    if (!answer || (answer.length == 0)) {
        answer = null;
    }
    this.cache[question] = answer;
}
return answer;
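A runnable sketch of the simpler approach, to check that an empty result really is cached and not recomputed. The `lookup` object, its `find` method, and this toy `long_expensive_calculation` are stand-ins invented for the example; only the body of `find` follows the code above.

```javascript
// Hypothetical stand-in for the slow calculation; it records each call.
let calls = 0;
function long_expensive_calculation(question) {
  calls += 1;
  return question === "empty" ? [] : [{ q: question }];
}

const lookup = {
  // Object.create(null) avoids inherited keys such as "toString".
  cache: Object.create(null),
  find: function (question) {
    var answer = this.cache[question];
    if (typeof answer == 'undefined') {
      answer = long_expensive_calculation(question);
      if (!answer || answer.length == 0) {
        answer = null; // cache "nothing" as null, not as a sentinel string
      }
      this.cache[question] = answer;
    }
    return answer;
  }
};

const first = lookup.find("empty");  // computed, stored as null
const second = lookup.find("empty"); // served from the cache
const hit = lookup.find("x");        // computed, stored as a list

console.log(first, second, hit.length, calls); // null null 1 2
```

Because a cached null is not undefined, the second lookup of "empty" never reaches the calculation, and no string comparison is needed anywhere.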
I use Python, which doesn’t do these sorts of magic conversions, but it also forces me to explicitly convert ints to floats if I want a float answer, which is also a pain. I’d kind of like a middle ground: implicit conversion among numeric types is ok, but not between numbers and strings.
Comments
In PHP, to join two strings:
$a = "hello " . "world";
But it converts strings to numbers if you use + - etc.
eg.
$a = "1" + 2;
A string which is not a number turns into 0.
So for a python person... (and probably others) this php is confusing.
$a = "hello " + "world";
That will not make $a === "hello world" at all.
In JavaScript, a string which is not a number turns into NaN.
Python's behaviour with strings and the + - operators is almost as confusing.
In a lot of cases using separate operators for adding sequences is useful.
Using separate operators for adding sequences is more explicit. However, using the + operator to join strings is what is expected.
I really really hate the implicit conversion of strings to ints... it's an awful idea. The code is assuming way too much about what I mean.
Operator overloading is almost always confusing, and when you do it with two seemingly random types for some special purpose, it's just a bad idea.
At least with two types that are the same (thus, strong typing = good stuff), then you have some expectation that the user knows what they're doing... both objects either have a + operator or not.
But with implicit casting (weak typing), one thing could get turned into something else.
What if you have foo + bar, and bar can be implicitly cast into multiple types, each of which has a different meaning when used with the + operator on class foo? Awful.
Given to a newbie:
2 + "3" -> 5 is the expected result.
The original example in flash by a child programmer shows this (but doesn't prove it). I think more study would prove this is expected by newbies.
Implicitly turning ints into floats is a little dangerous, but sometimes expected. Dividing one int by another and implicitly getting a float result is maybe 50/50 expected: definitely not expected by most people used to ints, but expected by people new to ints and floats, and by people who just expect a number result rather than an int result.
If you got a string from somewhere and want to try to add it to something, first do some explicit conversion, verify that it is indeed convertible to a number, and *then* add the two ints together.
The code you write will be more robust, more legible, and more maintainable.
int AddToUIValue(int x)
{
    try
    {
        return x + Converter.ConvertToInt32( textbox.Text );
    }
    catch ( ConversionException )
    {
        MessageBox.Show( "Text in box must be a number.");
        // Assumption: fall back to the unchanged value so every path returns.
        return x;
    }
}
Even if you're a newbie and don't do try/catch stuff:
int AddToUIValue(int x)
{
    return x + Converter.ConvertToInt32( textbox.Text );
}
At least then the programmer is forced to realize there's a conversion there, and ConvertToInt32 should have documentation telling you it can throw and under what circumstances.
from __future__ import division
With that import, division, the most common case of converting int to float, implicitly converts ints. This will be the default in Python 3.0, and we can use the // operator to do floor division, if we really want to.
if (typeof(answer) == 'string' && answer == '-')
Try doing this:
if(answer === '-')
This should make the typeof check unnecessary, since it demands identity instead of equality. PHP has the same operator for when you don't want type conversions to be done. It is especially useful when you want to make the distinction between zero and false.
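A short sketch of that suggestion in JavaScript, reusing the idea of a list whose elements count their own string conversions. The `cached` list and `stringified` counter are hypothetical names for illustration only.

```javascript
// Counts string conversions triggered by comparisons.
let stringified = 0;
const cached = [{ toString: function () { stringified += 1; return "x"; } }];

// Strict comparison: the operands have different types, so the result is
// false immediately and the list is never converted to a string.
const strictResult = (cached === '-');
const strictConversions = stringified;

// The same operator separates zero from false.
const looseZero = (0 == false);   // true: false is coerced to 0
const strictZero = (0 === false); // false: number versus boolean

console.log(strictResult, strictConversions); // false 0
console.log(looseZero, strictZero);           // true false
```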
Consider that the + operator is treated as actual different operations under different conditions: concatenation for strings, addition for numbers. Concatenation and addition are not at all analogous, especially since integers certainly can be concatenated.
The bug your son encountered is a result of the design error of overloading the + operator in JavaScript (and, as a result, ActionScript). Had there been a dedicated concatenation operator (say, ..), this wouldn't be an issue.
So long as operators perform analogous functions over different types, weak typing works just fine.
Of all of the potential pitfalls of weak typing, I think that the string/integer one is the most annoying and commonly encountered.