Saturday 9 April 2005 — This is close to 20 years old. Be careful.
Yesterday I wrote about the beauty of the relational model (actually I thought I was just working on a draft, but I published it without realizing, so now this is a two-parter!)
So how did the ADO.NET designers overlook the fundamental power of the relational model’s universal rectangles? They’ve provided some very powerful facilities: A DataTable is an in-memory collection of DataRows. DataTable allows you to use SQL syntax to query the data in memory, tracks which rows have been changed, keeps before and after copies of changed rows, fires events when data changes, and simplifies writing the data back to the database. They’re a very cool way to build an in-memory snapshot of data residing in a relational database.
A DataSet contains a number of DataTables, which can have intra-table constraints enforced on them. Finally, a DataView can provide a filtered subset of the data in a DataTable.
So this is all very nice, but where are the universal rectangles? Conceptually, there are at least three different structures that provide a bunch of rows: DataTable, DataView, and DataRow[] (which is returned by DataTable.Select). But these things are all different, and can’t be used interchangeably. DataTable provides a Select method for querying its data, but DataView does not. Why not? DataViews are constructed with a DataTable and a select expression, but you can’t make one from a DataView and a select expression. Why not? There’s no base class or interface that both DataTable and DataView implement, so you can’t write a method that uses them interchangeably.
Well, you can, but only by treating the data as an IList or IEnumeration implementation, and even then, inexplicably, the rows of a DataTable are DataRow objects, while the rows of a DataView are DataRowView objects, and the two classes have no base in common! (While I’m ranting, shouldn’t DataRowView be called DataViewRow?!) So if you want to treat a DataTable and a DataView the same, the best you can do is to fill your code with ugly downcasts to get at the rows.
And using as stripped-down an abstraction as IEnumeration means that you’ve lost access to the facilities of DataTable and DataView other than the pure data: the column schema, the events, and so on. It’s infuriating. I want to use these classes to build power in my client. I want to use the relational model of interchangeable rectangles to modularize my code and provide abstraction. But I can’t.
The designers of ADO.NET valued the relational model enough to implement the whole thing, including a powerful SQL implementation in memory. How could they have overlooked the basic tenet of the approach?
They clearly value using interfaces to abstract away unimportant details, because they gave us IDataReader for filling a DataTable from a database. When they needed an interface to help them ignore details they didn’t care about, they provided it. But for consumers of ADO.NET, nothing is ignorable. The details of the classes are in your face at every turn. I find it incredibly frustrating.
Comments
I've seen applications horribly distorted by sticking to using DataSets to represent application objects. The problem is that all your code ends up passing around and remembering not People, Modules, Questions, and Responses, but instead random numbers that might be people ids, module ids, question ids, or response ids, who knows? Endlessly looking up a person id over and over, rather than storing a Person reference. Not a pretty sight...
Assuming that we all here like working with objects, ideally we'd be storing our data in an object database. Trouble is that no one seems to have made an ODB that people trust... there are issues of concurrency, querying and schema evolution that I don't think have been dealt with in agreed upon standards. (Stuff that relational databases dealt with some time back...)
So, my past couple of apps have followed the OR mapping route. It's a nice compromise. You still get to work with objects for your basic CRUD, but you can drop down to SQL whenever you need to slice and dice.
This was a roundabout way of saying that, for most of my apps, I would just be working with my business objects anyhow, rather than generic structures like the ones provided by ADO.NET.
I also think the current implementation falls FAR short of "a powerful SQL implementation in memory", but man, I'd sure like one! You can see that's the direction things are going, and as far as I'm concerned we can't get there fast enough. I want mini-SQL Server as a core language feature that ships with the runtime.
Care must be taken WRT to data types, but if you like your SQL...
It's a giant Einbahnstrasse.
Oh, well, maybe next version.
Einbahnstrasse means one-way street.
Add a comment: