SQL hacks: numbers table

Thursday 3 March 2005

I've done SQL work for a number of years, but it's usually been pretty plain vanilla relational data: I haven't had to venture too far into the truly wild and wooly SQL hacks. For example, I'd never heard of a numbers table. It's a table with one integer column, and a row for each integer in a range (for example, 1 through 10,000). The table gets used for all kinds of unusual queries.

But is it a good idea? A quick look at some of these queries gives me the impression that they'd be inefficient:

DECLARE @csv VARCHAR(255)
SET @csv = '1,3,5,7,9,14,36,395'

SELECT n
    FROM numbers
    WHERE CHARINDEX
    (
        ','+CONVERT(VARCHAR(12),n)+',',
        ','+@csv+','
    ) > 0
    ORDER BY n

This will examine each row in the numbers table, and check if the number is in the string. This is bone-headed. In fact, if that exact algorithm were coded in a procedural language, it would qualify as a Daily WTF.

Sometimes the pure relational model nudges you into odd hacks like this, and sometimes it is hard to know the best way to slice a problem. For example, if the alternative to a numbers table is to write a loop in the client, then shuttling all the data to the client may be a bottleneck, and the numbers query may be better in any case.

As with all performance issues, you can take a guess, but the only way to really know is to measure a real system.

BTW: XSLT is another programming model that leads to these sorts of tricks. Both XSLT and SQL give you a fundamentally declarative style of programming. But developers think procedurally. The conceptual mismatch leads to difficulties expressing yourself, and then off-the-wall ideas like a numbers table to make more things possible.

Comments

[gravatar]
Loz Hygate 9:27 AM on 3 Mar 2005

I don't think a comma separated list of numbers has any place in the "pure relational model". In a pure relational model that would be a relation in its own right.

[gravatar]
infidel 10:05 AM on 3 Mar 2005

It can be handy for any number of things. I've only used it once in our production code so far, and granted it was just to see if I could (could've used a PL/SQL loops instead), but it was kind of a fun hack.

It can also be used for things like opening a cursor that is a series of dates:

select add_months('01-Jan-2005', x) as d
from numbers_v
where x < 12;

The alternative to that would be twelve selects unioned together.

[gravatar]
Joe Grossberg 11:29 AM on 3 Mar 2005

Well, at least you can declare variables in that SQL example. XSLT doesn't even afford you that.

[gravatar]
Ned Batchelder 11:39 AM on 3 Mar 2005

XSLT has variables <xsl:variable> lets you defined named values.

Perhaps what you mean is that it doesn't let you change the value once it is set (which throws into doubt the whole label "variable", but whatever).

[gravatar]
Damian Cugley 6:23 AM on 6 Mar 2005

The reason for using comma-separated lists is because SQL does not have relations as first-class values, so you can’t pass lists or relations as parameters to stored procedures or do any number of other useful things. I’ve resorted to using Microsoft SQL Server 2000’s XML support for the same reasons.

The SQL language would be infinitely more convenient if it had sets and rows as first-class datatypes, not to mention an expression syntax analagous to Lisp’s let clauses (defining temporarty names in the middle of a compelx expression). With those features it would be possible to do more in the purely-functional subset of the language, avoiding the clunky procedural stuff.

Add a comment:

name
email
Ignore this:
not displayed and no spam.
Leave this empty:
www
not searched.
 
Name and either email or www are required.
Don't put anything here:
Leave this empty:
URLs auto-link and some tags are allowed: <a><b><i><p><br><pre>.