wcscmp vs StrEqual

Tuesday 18 October 2005

I know this is a small thing, but it snagged my brain this morning. I hate the C standard library function wcscmp. First, because the name is so crappy (it stands roughly for Wide Character String CoMPare). Second, because although it returns a true comparison (which of the two arguments is greater), it's almost always used to compare for equality, and therefore has to be negated. It returns zero if the strings are equal, so you end up writing this:

if (!wcscmp(s1, s2)) {
   // Do something if s1 and s2 are equal...
}

Or worse, you want to know if they are not equal, so you leave off the negation. When I see wcscmp, I think, "we're comparing to see if the strings are the same", but then I have to carefully navigate that negation to get the sense correct.

If we're not going to have a real string class with an equality operator, I much prefer to have a StrEqual function:

inline bool
StrEqual(const wchar_t * s1, const wchar_t * s2)
{
    return !wcscmp(s1, s2);
}

Then I can write what I mean. But others feel that at least wcscmp is standard. I'll take readable and straightforward over standardized obfuscation any day.

Comments

[gravatar]
Nate Finch 9:17 AM on 18 Oct 2005

I'm with you, Ned. Just because wcscmp is the standard doesn't mean it's good. Is anyone really going to look at the code below and *not* understand it? I think not.

if ( !StrEqual( sname1, sname2 ) )
{
// do something if not equal
}

[gravatar]
Steve 9:44 AM on 18 Oct 2005

If I'm using C I write

if(0 == strcmp(str1, str2))

rather than using !strcmp() - but I avoid having to write C if at all possible.

[gravatar]
sil 11:54 AM on 18 Oct 2005

I'm with Steve here; personally, I think that 0 and False shouldn't be treated as the same thing in this sort of context, and using wcscmp()'s return of 0 as a Boolean *is* obfuscatory, but then that's why I'm not a C hacker. :)
I'm inclined to think that if (wcscmp(s1,s2) == 0) isn't anywhere near as obfuscated, even though it's the same thing, because you're not mentally expecting the function to return "are they equal or not", you're expecting it to return a particular value or not. That that value *happens* to also be the Boolean false is a coincidence.

[gravatar]
Joe 12:01 PM on 18 Oct 2005

Another vote for Ned, StrEqual is the correct thing to do. But then, I'm crazy enough to name things areStringsEqual(string1, string2)

[gravatar]
Keith 12:18 PM on 18 Oct 2005

I'd tend to do "if(wcscmp(s1,s2) == 0)". Then at least you're making a "positive" comparison rather than "negative" (i.e. "!wcscmp"). But man, 'StrEqual' is a lot nicer than 'wcscmp'.

[gravatar]
Len Holgate 2:05 PM on 18 Oct 2005

I'm with Steve and Sil and Keith. I'd write it as 0 == wcscmp(), it doesn't return false, it returns 0... But, like Ned, I'd most probably do this just the once, inside an inlined helper function called something like StrEqual that returns a bool...

[gravatar]
Ned Batchelder 2:15 PM on 18 Oct 2005

It sounds like we're all roughly on the same page, EXCEPT:

There's no way I'm writing "0 == function()" instead of "function() == 0". I understand the logic of "0 == foo" (it prevents an accidental assignment instead of comparison), but "function() = 0" won't compile, so we don't have to prevent against it by twisting things backwards!

Yuk.

[gravatar]
Nate Finch 3:26 PM on 18 Oct 2005

Actually, I prefer 0 == function(), because it's a lot easier to tell that there's actually a compare in there, rather than thinking maybe function() returns a boolean.

For example:

if ( FunkyFunctionToDoStuff( variable1, variable2, variable3 ) == 0 )

is a lot harder to scan over in the code than this:

if ( 0 == FunkyFunctionToDoStuff( variable1, variable2, variable3 ) )

To get the gist of the first one, you need to read the entire line, for the second one, you only need to read 10 or so characters (assuming you don't have a lot of "FunkyFunction"s :)

[gravatar]
Bob 4:25 PM on 18 Oct 2005

I'm with Ned on his comments about backwards syntax. In my opinion it's ugly.

I used to work with someone who used to switch *all* of his logic around to put the variable at the end of the expression. He'd write code like this:

if ((0 < foo) && (10 > foo))

rather than

if ((foo > 0) && (foo < 10))

As far as (func() == 0) vs. (!func()), I prefer the first. Mostly because I've been mostly working in languages that don't allow int and bool(ean) to be treated as the same thing.

[gravatar]
Nate Finch 5:08 PM on 18 Oct 2005

For non-symmetrical binary operators like greater-than and less-than, I agree, it's easier to understand when the variable is on the left. However, for symmetrical operators like equality, I really can't see how 0 == foo is any harder to read than foo == 0.

[gravatar]
andrew 5:14 PM on 18 Oct 2005

Ahh yes Bob. I remember that person too. I used to spend time contorting my brain trying to figure out what he was saying.

[gravatar]
Bob 8:01 PM on 18 Oct 2005

Nat, you've making your argument a little weaker. Why is this:

if ( FunkyFunctionToDoStuff( variable1, variable2, variable3 ) == 0 )

different from:

if ( FunkyFunctionToDoStuff( variable1, variable2, variable3 ) > 0 )

Seems that you ought to reverse them both if you want to see the operator earlier in the expression.

Regarding the (0 == func()) syntax. Code isn't just written to be compiled, it's written to be read. To me, the reverse comparison looks stylistically "wrong". It's entirely equivalent to the compiler but I prefer the classic comparison syntax. It's no different than brace style or any other stylistic code layout issue.

To avoid "layout style" wars, any reasonably large project should adopt a coding style guide as early as possible. Not to stifle creativity -- it's to make sure that everyone on the project can quickly and accurately read one another's code without tripping over stylistic differences.

[gravatar]
Bob 8:02 PM on 18 Oct 2005

Andrew: yeah, that's the guy. Early in his career he wrote a lot of FORTH. That syntax would warp anyone.

[gravatar]
Leonardo 8:59 PM on 18 Oct 2005

Why not use a macro? and maybe naming it isStrEqual

[gravatar]
Ned Batchelder 9:42 PM on 18 Oct 2005

What's the advantage of a macro? By putting the function definition in a header file, and declaring it "inline", it works out the same, but you don't need to contort yourself lexically.

[gravatar]
Nate Finch 9:28 AM on 19 Oct 2005

Bob... I would like to be able to write it as 0 < FunkyFunction(), but that's too difficult to understand, as others have pointed out. The whole point is making the code easier to read, as you said. Your argument against 0 == FunkyFunction() is that it looks "wrong". Well, it may look "wrong" but it's easier to read and understand quickly.

And yes, at every job I've worked, we've had coding standards which everyone was required to follow. it's definitely a good idea, especially to reign in some of the more messy coders.

[gravatar]
Bob 1:14 PM on 19 Oct 2005

Nat, sure, "looks wrong" is subjective. In my case, (0 == FunkyFunction()) looks "wrong" because I haven't worked on a project where that was the style. It may be "easier to read and understand quickly for you but not for me. That notion of "looks wrong" is important. When reviewing code, we want to be able to allow errors or coding issues to stand out. If someone on the project is using strange syntax (like the guy I mentioned in a previous comment) that's a barrier to understanding for others. But I'm flexible. If I worked on a project where (0 == FunkyFunction()) was the way in which the team wrote predicates, I could adapt.

There are a lot of little decisions for coding style:brace style, indentation, paces vs. tabs, comments, naming conventions, etc. They're subjective but not unimportant. It's highly unlikely that you'll get unanimity on "one true style" but developers should be flexible enough to adapt. On a recent project one of the developers ranted and railed about the choice of brace style. He insisted that his style was more readable and refused to change. Yeah, he was a real team player ;-) But the decision wasn't made arbitrarily. We were adopting an existing style guide and there was a large existing body of code that was already written in that style.

Add a comment:

name
email
Ignore this:
not displayed and no spam.
Leave this empty:
www
not searched.
 
Name and either email or www are required.
Don't put anything here:
Leave this empty:
URLs auto-link and some tags are allowed: <a><b><i><p><br><pre>.