Database naming

Tuesday 15 July 2008

Jeff Atwood has a good post about database design: Maybe Normalizing Isn’t Normal, all about data normalization. I like the quip it ends with: normalize until it hurts, denormalize until it works.

In the comments, a few people quibble with the naming of his tables and columns. It’s an age-old debate: are relational database tables named with a singular or a plural? The original proponents of relational design used singular nouns: User, Employee, Manager. But conceptually, a table is a set of things, and so should be plural, no?

On the plus side for singulars, it works better in today’s ORM-heavy world. It simplifies the transition from objects to tables, since the class name will be singular. Typically, database table names have to be explicitly specified to make them plural since pluralization is hard to do automatically. At Tabblo, we have a table correctly named stories, but also one incorrectly named addresss.

Another minor point for the singular camp is the case where a single row in the table is actually described by a plural already. Suppose you have a class called UserFlags. What’s the table called? If you favor plural table names you’d have to pluralize it again to UserFlagses?

An advantage for plurals is that it makes SQL queries sound right. The statement “select * from users” simply sounds right. Although, if you have to qualify column names, it sounds odd again:

select users.name from users where blahblah

I suspect which side you lean toward will depend on how you were raised, like any religious argument.

On another point, Jeff shows a User table with a user_id column as the primary key. At first I recoiled: shouldn’t the primary key be named “id”? But as it happens, I have often made the mistake when typing ad-hoc SQL queries of using user_id in the User table, simply because I’ve been using it everywhere else.

If I type

select * from stories where user_id = 6

to see user 6’s stories, then it’s natural to type

select * from users where user_id = 6

to see user 6 himself. Of course it’s natural to use a table-qualified id column name for foreign keys, but for a primary key? I’ve never tried it, so I don’t know what pain I might incur further down the road.

BTW: Joel’s forum has an an old thread on the issue.

Comments

[gravatar]

Check out the "stack overflow" podcast with Joel and Jeff, great stuff (cheap).

[gravatar]

All my tables use a three-character prefix for the column names, so I would (and do) use "usrID" as the primary key for the user (not users!) table. This allows you to write queries without name qualification.

Foreign keys get a double prefix, so the customer table's foreign key reference to a user would be cstUsrID. It works for me and it's relatively easy to explain, which is all that really matters.

regards
Steve

[gravatar]

Certain frameworks have certainly gone out of there way to go for the pluralized database naming conventions. CakePHP, for example, has an entire "Inflector" class devoted to these conversions. Personally, I have every table to class relationship explicit; with data mappers, it isn't always one to one.

P.S. Some call it best practice to explicitly name your table with every query, so your last query becomes "select * from users where users.id = 6"

[gravatar]

How about UserFlagSets? Of course, whether you would then want to rename your class to UserFlagSet is an open question.

[gravatar]

The second point about user_id is particularly interesting. I think about the SQL teaching I was doing at Tabblo (Eric and Jeff in particular) and the concept of item.user_id=users.id was an endless confusion. item.user_id=tag.user_id=attribute.user_id was no problem, they'd get it right away. But drag in that users.id, and you could just see the mental engine jumping the tracks and tumbling over the cliff. For them, at least, user.user_id would have provided a naming consistency that would have smoothed them through some sticky parts.

[gravatar]

Another advantage of chosing the "tablename_id" style is that you can use the USING shortcut for JOINs. Instead of "JOIN ... ON (t1.id = t2.user_id)" you can write "JOIN ... USING (user_id)".

[gravatar]

> addresss

I don't know why, but this is hilarious to me! Great post as usual. Another thing in the endless list of things for programmers and dbas to argue about.

[gravatar]

I've always found it simplest to think of a table definition as a defintion of a type of row, just like a class definition is a definition of a type of object.

Then it's never in question: the name of the *type* is singular, hence the name of the class or table is singular.

[gravatar]

Where I hail from, we use "oid" as the PK on every table (as in "object id"). Joining some_table.user_oid to users.oid has never been an issue. The last place I was at did the same but with just "id". I can sympathize with the desire for consistency (user_id = user_id), but typing [table].[table]_id starts feeling redundant quickly, especially with longer table names.

We use singular table naming except for certain conflicting cases ("user" is a built-in function name in Oracle and it won't let you name a table that, so we're stuck with "users" - strangely enough, Oracle does let you use "role" as a table name)

[gravatar]

at my last job we had endless opinions on this subject. Now I have no coworkers (in dev that is) so it's all singular, no prepended type, easy on the underscores, and PKs are TableID. Except of course legacy schema which haunts me every day.

[gravatar]

The funny thing is, I have never working on a system of any significance that actual had a consistently-applied standard.

[gravatar]

I would lean towards singular table naming, simply because when designing, I like to think of the single entity I am describing. So for a table of users, I am thinking about how a single user would be represented, so the table name would be User. For collections of items, it can get confusing. For example, Operating Instructions. Does a single operating instruction contain one instruction, or is it a single collection of operating instructions? I think it could be both. Operating Instructions could contain a single instruction, or a set of instructions to be grouped under a single id.

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
URLs auto-link and some tags are allowed: <a><b><i><p><br><pre>.