Database naming

Tuesday 15 July 2008

Jeff Atwood has a good post about database design: Maybe Normalizing Isn't Normal, all about data normalization. I like the quip it ends with: normalize until it hurts, denormalize until it works.

In the comments, a few people quibble with the naming of his tables and columns. It's an age-old debate: are relational database tables named with a singular or a plural? The original proponents of relational design used singular nouns: User, Employee, Manager. But conceptually, a table is a set of things, and so should be plural, no?

On the plus side for singulars, it works better in today's ORM-heavy world. It simplifies the transition from objects to tables, since the class name will be singular. Typically, database table names have to be explicitly specified to make them plural since pluralization is hard to do automatically. At Tabblo, we have a table correctly named stories, but also one incorrectly named addresss.

Another minor point for the singular camp is the case where a single row in the table is actually described by a plural already. Suppose you have a class called UserFlags. What's the table called? If you favor plural table names you'd have to pluralize it again to UserFlagses?

An advantage for plurals is that it makes SQL queries sound right. The statement "select * from users" simply sounds right. Although, if you have to qualify column names, it sounds odd again:

select users.name from users where blahblah

I suspect which side you lean toward will depend on how you were raised, like any religious argument.

On another point, Jeff shows a User table with a user_id column as the primary key. At first I recoiled: shouldn't the primary key be named "id"? But as it happens, I have often made the mistake when typing ad-hoc SQL queries of using user_id in the User table, simply because I've been using it everywhere else.

If I type

select * from stories where user_id = 6

to see user 6's stories, then it's natural to type

select * from users where user_id = 6

to see user 6 himself. Of course it's natural to use a table-qualified id column name for foreign keys, but for a primary key? I've never tried it, so I don't know what pain I might incur further down the road.

BTW: Joel's forum has an an old thread on the issue.

Comments

[gravatar]
mikey 8:46 AM on 15 Jul 2008

Check out the "stack overflow" podcast with Joel and Jeff, great stuff (cheap).

[gravatar]
Steve 9:01 AM on 15 Jul 2008

All my tables use a three-character prefix for the column names, so I would (and do) use "usrID" as the primary key for the user (not users!) table. This allows you to write queries without name qualification.

Foreign keys get a double prefix, so the customer table's foreign key reference to a user would be cstUsrID. It works for me and it's relatively easy to explain, which is all that really matters.

regards
Steve

[gravatar]
Edward Z. Yang 9:14 AM on 15 Jul 2008

Certain frameworks have certainly gone out of there way to go for the pluralized database naming conventions. CakePHP, for example, has an entire "Inflector" class devoted to these conversions. Personally, I have every table to class relationship explicit; with data mappers, it isn't always one to one.

P.S. Some call it best practice to explicitly name your table with every query, so your last query becomes "select * from users where users.id = 6"

[gravatar]
Ian Phillips 11:54 AM on 15 Jul 2008

How about UserFlagSets? Of course, whether you would then want to rename your class to UserFlagSet is an open question.

[gravatar]
Dan Dunn 1:11 PM on 15 Jul 2008

The second point about user_id is particularly interesting. I think about the SQL teaching I was doing at Tabblo (Eric and Jeff in particular) and the concept of item.user_id=users.id was an endless confusion. item.user_id=tag.user_id=attribute.user_id was no problem, they'd get it right away. But drag in that users.id, and you could just see the mental engine jumping the tracks and tumbling over the cliff. For them, at least, user.user_id would have provided a naming consistency that would have smoothed them through some sticky parts.

[gravatar]
Fabian Neumann 2:38 PM on 15 Jul 2008

Another advantage of chosing the "tablename_id" style is that you can use the USING shortcut for JOINs. Instead of "JOIN ... ON (t1.id = t2.user_id)" you can write "JOIN ... USING (user_id)".

[gravatar]
Jeff Atwood 12:03 AM on 16 Jul 2008

> addresss

I don't know why, but this is hilarious to me! Great post as usual. Another thing in the endless list of things for programmers and dbas to argue about.

[gravatar]
Ben Finney 1:09 AM on 16 Jul 2008

I've always found it simplest to think of a table definition as a defintion of a type of row, just like a class definition is a definition of a type of object.

Then it's never in question: the name of the *type* is singular, hence the name of the class or table is singular.

[gravatar]
infidel 12:59 PM on 16 Jul 2008

Where I hail from, we use "oid" as the PK on every table (as in "object id"). Joining some_table.user_oid to users.oid has never been an issue. The last place I was at did the same but with just "id". I can sympathize with the desire for consistency (user_id = user_id), but typing [table].[table]_id starts feeling redundant quickly, especially with longer table names.

We use singular table naming except for certain conflicting cases ("user" is a built-in function name in Oracle and it won't let you name a table that, so we're stuck with "users" - strangely enough, Oracle does let you use "role" as a table name)

[gravatar]
Kearns 2:10 PM on 17 Jul 2008

at my last job we had endless opinions on this subject. Now I have no coworkers (in dev that is) so it's all singular, no prepended type, easy on the underscores, and PKs are TableID. Except of course legacy schema which haunts me every day.

[gravatar]
Marc 9:11 AM on 13 Aug 2008

The funny thing is, I have never working on a system of any significance that actual had a consistently-applied standard.

[gravatar]
macdba 11:35 PM on 24 Nov 2008

I would lean towards singular table naming, simply because when designing, I like to think of the single entity I am describing. So for a table of users, I am thinking about how a single user would be represented, so the table name would be User. For collections of items, it can get confusing. For example, Operating Instructions. Does a single operating instruction contain one instruction, or is it a single collection of operating instructions? I think it could be both. Operating Instructions could contain a single instruction, or a set of instructions to be grouped under a single id.

Add a comment:

name
email
Ignore this:
not displayed and no spam.
Leave this empty:
www
not searched.
 
Name and either email or www are required.
Don't put anything here:
Leave this empty:
URLs auto-link and some tags are allowed: <a><b><i><p><br><pre>.