ADO.NET's DataRelation ObjectJoining Tables the .NET Wayby William Ryan |
As I've mentioned before, of all of the areas where things have changed in .NET, ADO.NET is probably the biggest (at least it's one of the top 3). In classic ADO, there weren'tDataSets , DataTables or most other ADO.NET objects. In classic ADO, if you needed to join two tables, typically you'd write the SQL statement and include the join and pull back all of the data. By very definition, you are pulling back redundant data when you use this method. This has many drawbacks to it, the most striking of which is performance - after all, it takes a lot longer to pull back Xmb of data than it does X/10mb of data. This is where ADO.NET really shines. But, if you don't join your tables and pull them into a recordset, how do you get the data back? In comes the DataRelation.
This isn't a really difficult concept, but if you aren't familiar with DataSets and DataTables, I'd suggest you read up on before continuing. To begin with, I'm going to assume that you have two tables, Transactions and TransactionDetails. They have a bunch of fields in them, but they are related by a common field, TransactionID. So, the first thing you'd do is pull back all of the relevant data from both tables:
This isn't a really difficult concept, but if you aren't familiar with DataSets and DataTables, I'd suggest you read up on before continuing. To begin with, I'm going to assume that you have two tables, Transactions and TransactionDetails. They have a bunch of fields in them, but they are related by a common field, TransactionID. So, the first thing you'd do is pull back all of the relevant data from both tables:
DataSet ds = new DataSet(); SqlConnection cn = new SqlConnection(ConfigurationSettings.AppSettings("ConnectString"));SqlCommand cmd = new SqlCommand("usp_FillTransactions", cn); cmd.CommandType = CommandType.StoredProcedure; SqlDataAdapter daTransactions = new SqlDataAdapter(cmd); daTransactions.Fill(ds, "Transactions"); //pass in ds as DataSet, and "Transactions" as the table name SqlCommand cmdDetails = new SqlCommand("usp_FillTransactionDetails", cn); cmd.CommandType = CommandType.StoredProcedure; SqlDataAdapter daTransactionDetails = new SqlDataAdapter(cmdDetails); daTransactionDetails.Fill(ds, "Details"); |
Ok, now I have the complete Transactions table and the TransactionDetails table loaded into my DataSet (remember that the DataSet is an approximate abstraction of your database, so you wouldn't normally need more than one for any given app unless the data is coming from different databases).
But, if I were to bind some controls to Transactions and a grid to TransactionDetails without doing anything else, every time I navigated to a different record, Details wouldn't have any idea of what Transactions is doing and vice versa. In order the bind these two to each other, you need to use a DataRelation. So what benefits are we going to derive from this? The first is performance, we will pull over much less data as opposed to using a server side join. The second is validation. Once we add the Relation, we don't have to worry about deleting child records that don't have parents or inserting detail records without a valid Transaction. The third is easy of navigation. Once we Relate these tables, a BindingManagerBase or a BindingContext will automatically reposition the child records whenever the parent changes. Moreoever, we can propogate any changes in the parent table (Transaction) to the Child (TransactionDetails). The importance of this can not be understated. If set up correctly, the programmer doesn't have to worry about writing all the validation code and missing something. It's also a lot easier to write 5 lines of code than it is to roll out your own validation logic (when I speak of validation, I'm referring to it at the database level, you will probably still want to let the users know when they put a Name in a Date field for instance). Moreoever, you don't have to write update logic for this other than the update logic that your DataAdapter employs. You can do it the hard way and ignore the DataRelation, but in all likelihood you'll overlook something that you never intended to occur and risk dealing with bad data in your database. Finally, after the initial performance hit of loading your data (the DataRelation doesn't cause this) which would occur either way, you can show 'child' records without having to requery your database. In every conceivable way, performance will be improved and you will limit possible data errors. And even if you think you can write better code than the well tested DataRelation, the beauty of it is that it doesn't prohibit you from adding your own validation. So regardless of your goal, you can use the DataRelation to your advantage.
Hopefully I sold you on the benefits of this (if not, drop me an email at bill@knowdotnet.com and I can send you copies of about 50 emails I've received from people who refused to use the DataRelation, asked me for help and swear by it now) really great object but you are probably asking, how do I use it? Here it goes:
There are 5 constructors that can model just about anything you have in your database schema. I've heard many people say they don't use DataRelations b/c they haveCompound Keys in their Schema, but as you'll see, this criticism holds no water. The first constructor is probably the one you'll use most often. In our example above, we have TransactionID as a Primary key in the parent table and as it serves as the Foreign Key in the child table.
To build this, we pass in the name we want to use to refer to the relation (just like you name your Relations in SQL Server or Oracle), the Parent Column and the Child Column:
VB.NET: 'For simple single column relation Dim Tran_Detail as New DataRelation("ds", ds.Tables(0).Columns("TransID"), ds.Tables(1).Columns("TransID")) You can also use all Ordinal references or mix them as you see fit. Assuming that TransID was the first column in both Datatables, the following would work the same: Dim Tran_Detail as New DataRelation("MyRelationName", ds.Tables("Transactions").Columns(0), ds.Tables("Details").Columns(0)) 'or Dim Tran_Detail as New DataRelation("MyRelationName", ds.Tables(0).Columns(0), ds.Tables(1).Columns(0)) 'Finally ds.Relations.Add(Tran_Detail) |
C# DataRelation Tran_Detail = new DataRelation("ds", ds.Tables[0].Columns["TransID"], ds.Tables[1].Columns["TransID"]); //For the sake of brevity I'm not going to translate each of the above, but you can switch between the nominal or the ordinal and provided //you use the correct index, it will work the same ds.Relations.Add(Tran_Detail); Like I said, this is the simplest of the constructors. The other overloads are provided below: DataRelation(string, ParentDataColum(), ChildDataColumn()) 'this takes an array of DataColumns, so you could use this constructor the same way we did above DataRelation(string, ParentDataColumn, ChildDataColumn, Boolean) //Where the boolean instructs the DataRelation whether or not to enforce //the constraints. For good reason Constraints are enabled by default, and I'd recommend against setting this to false unless you have a //really good reason to do so...and if you do, don't complain to me when a user does something you didn't intend and your validation code //misses it. //Similarly, there is an Array Based constructor with the Boolean: DataRelations(string, ParentDataColumn(), ChildDataColumn(), Boolean) //this is identical to the one above it except it allows the use of multiple columns aka Composite Keys The final constructor allows you to simply name the tables and the columns, but this will already be done most of the time. For that reason, I'm not going to address, but it's Here if you find the need to use it. |
This is pretty much all there is to it at this point, but I'd like to mention two other things. First, if you don't supply a name for the DataRelation (you still have to declare it and give the variable name, but the first parameter in each constructor is the name that you can also use to reference it), it will supply one for you. Don't do this! While it will compile and work, more than likely it's going to confuse you, and you can bet it will confuse the next guy who uses your code. Second, you can have multiple DataRelations in a DataSet. If you think about it, it would be pretty lame if you could only have one relation in an entire DataSet. However, you can't have multiple datarelations between dataset. Stated simply, If I had 4 tables, Transactions & Details, Customers & ContactHistory, I could have a relation between Transactions and Details as well as one between Customers and ContactHistory in the same Dataset. But I couldn't have 2 Relations between Transactions and Details (and since you can use Compound Keys, there is absolutely no reason to do so).
Ok, so how do you use the other constructors? Let's say that instead of having only TransID as the Key in both tables, let's say that we had a compoun key, TransID, CustomerID and SalePersonID. All three fields exist in both tables. And since we are responsible programmers, we are going to leave the EnforceConstraints property where it should be , ON.
Here's how we'd do it:
VB.NET 'Declare the Columns - even though we have multiple fields, we only have two tables, hence TransactionColumns and DetailColumns Dim TransactionColumns() as DataColumn Dim DetailColumns() as DataColumn TransactionColumns = New DataColum(){ds.Tables(0).Columns("TransID"), ds.Tables(0).Columns("CustomerID"), ds.Tables(0).Columns("SalePersonID")} DetailColumns = New DataColumns(){ds.Tables(1).Columns("TransID"), ds.Tables(1).Columns("CustomerID"), ds.Tables(1).Columns("SalesPersonID")} 'We could also use all norminals, ordinals or any mixture of them 'Add the name and DataColumn arrays to the Relation Dim Tran_Detail as New DataRelation("myRelationName", TransActionColumns, DetailColumns) 'Add the Relation to the DataSet ds.Relations.Add(Tran_Detail) |
C# DataColumn[] TransactionColumns; DataColumn[] DetailColumns; TransactionColumns = new DataColumn[] {ds.Tables[0].Columns["TransID"], ds.Table[0].Columns["CustomerID"], ds.Tables[0].Columns["SalePersonID"]}; DetailColumns = new DataColumn[] {ds.Tables[1].Columns["TransID"], ds.Table[1].Columns["CustomerID"], ds.Tables[1].Columns["SalePersonID"]}; DataRelation Tran_Detail = new DataRelation("myDataRelation", TransactionColumns, DetailColumns); ds.Relations.Add(Tran_Detail); |
Now, we could use the exact same code and specify the EnableConstraints Property to either false or true. To do this, none of the code above would change except for the code in the constructor, to which you'd add a true or false parameter:
VB.NET Dim Tran_Detail as New DataRelation("myRelationName", TransActionColumns, DetailColumns, True) |
C# DataRelation Tran_Detail = new DataRelation("myDataRelation", TransactionColumns, DetailColumns, true); |
One last thing before I wrap it up. I left out the more complex constructor which is somewhat more complex because while it's simple to implement, understanding it conceptually isn't. I think you need to fully understand DataRelations before you go diving into scenarios where you are nesting relations. A common scenario where you'd use such a technique could include a case where TransDetails had a child table called PayementDetails. When you iterate through Transactions, you might have a grid which would display the TransDetails but you might have another grid or tab that displayed the payment details of each of the Transaction Details. In properly normalized data, this is something you'll probably run across, but you can still accomplish what I just mentioned using standard relations. I've used these on a few occassions and think they are pretty straighforward, but given the fact that many people have trouble getting composite keys to work, I opted to leave this out of the discussion. I've got a few code examples and if you are interested in seeing how this works, email me Here and I'll send them to you.
Conclusion:
There are many reasons to use DataRelations and very few not to, in fact I can think of only one. Here are the big ones that I can think of:
1) They enforce integrity! Since this is a fundamental goal of any true Relational Database, this alone is enough. And while you think you can roll out your own logic, do yourself a favor and use these. If you really think you can do a better job, you can still include your own validation code...but don't ignore these. They'll cascade your changes, they'll propogate updates (for instance, if you get an Identity value back from your database and update the parent column, the children will be updated too), they are simple to use and they are powerful
2) They make otherwise difficult navigation much easier. In fact, if you don't use them, you'll have to constantly requery your database to the child values, and this causes unnecessary strain on the database, network and client machine. In addition, if you requery your database each time a change occurs, you are limiting your applications ability to operate in disconnected scenarios which has no upside
3) They are faster and more efficient than other approaches. Since you have reduced network traffic and fire fewer queries, your app will repond quicker
4) They can handle just about anything you can throw at them. The DataColumn is a very powerful object and combined with a DataRelation, you can support composite keys, Auto Increment fields, and just about anything else you might throw at it.
The only downside I can see is that if you aren't using a BindingContext or a BindingManager, navigation isn't automatic as far as filtering goes. However, this is a very small issue overall, and when compared to the benefits you gain, it's inconsequential.
Great post!
ReplyDeleteRegards
Very nice and detailed post! Thanks! :)
ReplyDeleteThanks for the valuable lesson! It works like I want now :)
ReplyDelete