Stop me if this sounds familiar.
Business logic within a client application correctly validates and stores small amounts of data in a table while keeping related data in another table synchronized.
DBAs often have to enter data into this table by hand (they could use the application, but that is more time-consuming), without the benefit of validation and sometimes forgetting to keep the related data in sync; together these lead to application crashes.
Here's a fictional and simplified example of this issue, with some database objects that are experiencing data purity issues.
|Frequency_Code||A table holding different measures of time; Mondays, Minutes, etc.|
|Frequency_Data||Rows with a measure of time and optional time quantity and no database-side validation.|
|Frequency_Info||A single row with a single column to unnecessarily store the next identity value for Frequency_Data for use by the client. It is actively read, used, and updated (only in conjunction with the Frequency_Code table however) so should not be faked such as with a read-only view.|
|Frequency_View||A normalized view of all of the above tables.|
-- Clear leftover tables
If Object_Id('Frequency_Data') Is Not Null
    Drop Table Frequency_Data
Go
If Object_Id('Frequency_Code') Is Not Null
    Drop Table Frequency_Code
Go
If Object_Id('Frequency_Info') Is Not Null
    Drop Table Frequency_Info
Go
If Object_Id('Frequency_View') Is Not Null
    Drop View Frequency_View
Go

-- Tables
Create Table Frequency_Code (
    Code Int Not Null,
    Description Nvarchar(Max) Not Null,
    Constraint PK_Frequency_Code Primary Key Clustered (Code)
)
Go
Create Table Frequency_Data (
    Frequency_Data_Key Int Identity(1, 1) Not Null,
    Frequency_Code Int Not Null,
    Frequency_Amt Int Null,
    Constraint FK_Frequency_Data_Frequency_Code Foreign Key (Frequency_Code) References Frequency_Code (Code),
    Constraint PK_Frequency_Data Primary Key Clustered (Frequency_Data_Key)
)
Go
Create Table Frequency_Info (
    Next_Frequency_Data_Key Int Not Null
)
Go

-- View
Create View Frequency_View
As
Select Next_Frequency_Data_Key,
    Frequency_Data_Key,
    Frequency_Amt,
    Description As Frequency_Description
From Frequency_Data FD
Join Frequency_Code FC On FD.Frequency_Code = FC.Code
Cross Apply Frequency_Info
Go

-- Populate data
Insert Frequency_Code (Code, Description)
Select 1, 'Monday'
Union
Select 2, 'Minutes'
Go
Insert Frequency_Data (Frequency_Code, Frequency_Amt)
Select 1, Null -- Every Monday, ignore the minutes
Union
Select 1, 1    -- Every Monday, ignore the minutes
Union
Select 2, 10   -- Every 10 Minutes

-- Populate the next identity value
Insert Frequency_Info (Next_Frequency_Data_Key)
Select Scope_Identity() + 1

-- View the setup
Select * From Frequency_View
Go
Our goal is to prevent DBAs from doing this:
/*
-- Don't do this
Insert Frequency_Data (Frequency_Code, Frequency_Amt)
Select 2, Null -- Every Null Minutes
Union
Select 2, 0    -- Every 0 Minutes
Go
*/
Your first reaction may be to point out the design flaws in the database structure, then advocate refactoring away the unnecessary second identity column and adding a check constraint. I think this is technically correct, but it comes from the viewpoint of developers used to the beginning of a project, who are less familiar with the struggles of a DBA long after the project has gone live.
|Refactoring Mindset||Practical Mindset|
|I would do it differently and the way I do it is best practice.||Production databases are messy and imperfect. They may have a good base design but then can grow organically in unintended ways.|
|If it's not best practice it must be poor design.||Perhaps the quality of the design is not what the organization values most; delivered value rules supreme. Does anybody care that a support table was smacked together to get the job done (under severe time constraints, and to cater for edge cases that no longer apply)? Professional work is rarely fit for a textbook.|
|If it's poor design it should be refactored.||Databases should not be refactored just to satisfy your compulsion for tidiness. Refactoring requires the most manpower, time, and money, and that cost is used to justify shooting down any partial fix at all; besides, why pay to fix something when DBAs can just use the client (slowly) or be on the receiving end of endless email reminders about the right way to validate data in that table?|
So resist those urges.
I think the appropriate solution in this case is a single trigger on Frequency_Data which will validate the Frequency_Code and Frequency_Amt combination and, where necessary, update Frequency_Info. Why?
- We need to write a trigger anyway to keep Next_Frequency_Data_Key in sync, so we may as well do the validation in there as well instead of creating an additional check constraint.
- It will provide DBAs entering incorrect data manually with an informative error message instead of a generic failure message.
- It can be implemented by the DBA without any other refactoring. (Unless SqlBulkCopy is used, see later).
This article is about the design process, some of the missteps and failures along the way, and discussions of other related tidbits including the differences if we had gone down the check constraint path.
First we need to decide between an INSTEAD OF and an AFTER trigger (SQL Server has no BEFORE trigger; INSTEAD OF is its closest equivalent). INSTEAD OF triggers are usually used when you need to manipulate the incoming data before it reaches the table; AFTER triggers are better for straight validation or adjustments after the data has been inserted. So we'll use an AFTER trigger.
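For contrast, here is a minimal sketch of SQL Server's INSTEAD OF form (the trigger name is made up for illustration). Because it replaces the firing statement entirely, it must re-issue the insert itself:

```sql
-- Hypothetical sketch: an INSTEAD OF trigger replaces the firing statement,
-- so after any manipulation it must perform the actual insert itself
Create Trigger TRG_Frequency_Data_InsteadOf On Dbo.Frequency_Data
Instead Of Insert
As
Begin
    Set Nocount On
    -- Inspect or adjust the Inserted pseudo-table here, then insert for real
    Insert Frequency_Data (Frequency_Code, Frequency_Amt)
    Select Frequency_Code, Frequency_Amt
    From Inserted
End
Go
```

For our purposes this extra responsibility is a drawback; straight validation fits the AFTER form better.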
The first part of the trigger needs a validation rule. A row is valid if
Frequency_Code = 1 (Monday) Or
Coalesce(Frequency_Amt, 0) <> 0.
Next, trigger best practice is to test whether an action is necessary before undertaking it. That's not so important for such a small trigger, but we're going to do it anyway because this is an article about odds and ends, and we do it differently for the validation and the key-update parts of the trigger.
For example, if we start from the knowledge that our tables already hold only clean data, we only need to test the validation rule where Frequency_Code has been inserted or updated. This can be done with the trigger Update() function:
If Update(Frequency_Code)
Begin
    -- Do validation
End
Determining whether we need to update the key value is not so clear. When will it change? Our first (but not last) answer to this will be: on any insert. Unfortunately there is no equivalent one-word test like Update() to detect an insert.
In this specific case, observe that we are using an identity column, and identity columns cannot be changed on an existing record. If you attempt this:
Update Frequency_Data
Set Frequency_Data_Key = 4
Where Frequency_Data_Key = 3
Go
You'll get this error:
Msg 8102, Level 16, State 1, Line 1
Cannot update identity column 'Frequency_Data_Key'.
So a cute way to test for an insert is to use If Update(Frequency_Data_Key). If this were not an identity column but just an ordinary primary key column, this trick would not apply, and we could use another method which I'll show later in the article.
Tying all of this information together here is a first attempt at the trigger:
-- A more descriptive name would be appropriate later
If Object_Id('TRG_Frequency_Data_After') Is Not Null
    Drop Trigger Dbo.TRG_Frequency_Data_After
Go
Create Trigger TRG_Frequency_Data_After On Dbo.Frequency_Data
After Insert, Update
-- =============================================
-- Description:
-- Modification:
-- Date         By          Description
-- 08-Oct-2013  C Konior    Created
-- =============================================
As
Begin
    -- This is so as not to return extra row counts
    Set Nocount On
    Declare @FrequencyCodeMonday Int = 1

    -- Insert or update
    If Update(Frequency_Code)
    Begin
        If Exists (Select Top 1 0
            From Inserted
            Where Frequency_Code <> @FrequencyCodeMonday
            And Coalesce(Frequency_Amt, 0) = 0)
        Begin
            Raiserror('Frequency_Data cannot have a Null or 0 Frequency_Amt when Frequency_Code <> Monday', 16, 1)
        End
    End

    -- This must be an insert because we can't update an identity column
    If Update(Frequency_Data_Key)
    Begin
        Update Frequency_Info
        Set Next_Frequency_Data_Key = Coalesce(Max_Frequency_Data_Key, 0) + 1
        From Frequency_Info
        Outer Apply (
            Select Max(Frequency_Data_Key) As Max_Frequency_Data_Key
            From Frequency_Data) Frequency_Data
        Where Next_Frequency_Data_Key <> Coalesce(Max_Frequency_Data_Key, 0) + 1
    End
End
Go
Looks kind of good right? Let's test it:
Declare @FrequencyCodeMinutes Int = 2

Insert Frequency_Data (Frequency_Code, Frequency_Amt)
Select @FrequencyCodeMinutes, 0
Go
Select * From Frequency_View
Go
Msg 50000, Level 16, State 1, Procedure TRG_Frequency_Data_After, Line 22
Frequency_Data cannot have a Null or 0 Frequency_Amt when Frequency_Code <> Monday.

(1 row(s) affected)
Something has gone wrong: this is the first bug in the sample trigger. Raising a user-defined error like this just prints a message and has no other effect on the trigger; it's our responsibility to signal an error by rolling back the transaction and returning once we have finished doing whatever else we want to do.
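The fix, in isolation, is the pattern below; it slots into the validation branch of the trigger in place of the bare Raiserror:

```sql
-- Signal failure properly: the Raiserror alone is only a message
Raiserror('Frequency_Data cannot have a Null or 0 Frequency_Amt when Frequency_Code <> Monday', 16, 1)
Rollback Transaction  -- undo the offending statement
Return                -- stop executing the rest of the trigger
```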
Updating the trigger to do that and re-executing the command results in this message instead:
Msg 50000, Level 16, State 1, Procedure TRG_Frequency_Data_After, Line 22
Frequency_Data cannot have a Null or 0 Frequency_Amt when Frequency_Code <> Monday.
Msg 3609, Level 16, State 1, Line 4
The transaction ended in the trigger. The batch has been aborted.
But there are more bugs to be found. What happens when someone does this?
Delete From Frequency_Data
Where Frequency_Data_Key = 3
Go
Select * From Frequency_View
Go
Note that we haven't done anything to keep our Next_Frequency_Data_Key in sync for deletes. If we alter our trigger definition to cover After Insert, Update, Delete, another bug introduces itself: the Update(Frequency_Data_Key) test will not return true on deletes, and there is no equivalent one-word Delete() function.
The solution to this lies in the two tables that are made available to the trigger for processing: Deleted and Inserted. Here are the rules:
- If an update happened, Deleted holds the old rows, Inserted holds the new rows.
- If an insert happened, Deleted will be empty, and Inserted holds the new rows.
- If a delete happened, Deleted holds the old rows, and Inserted will be empty.
- If the Deleted and Inserted tables are both empty, a statement was run that did not modify any data, and for our purposes it can be ignored. A merge may also meet more than one of these criteria, but that will not affect us.
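The rules above can be sketched as a simple classification inside any trigger body (the Print statements stand in for real logic):

```sql
-- Classify the firing statement from the Inserted and Deleted pseudo-tables
If Exists (Select Top 1 0 From Inserted) And Exists (Select Top 1 0 From Deleted)
    Print 'Update'
Else If Exists (Select Top 1 0 From Inserted)
    Print 'Insert'
Else If Exists (Select Top 1 0 From Deleted)
    Print 'Delete'
Else
    Print 'No rows were modified'
```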
We can now write a fixed trigger that includes the rollback and return described above, handles deletes, and as an added bonus would also work properly with a non-identity primary key column:
If Object_Id('TRG_Frequency_Data_After') Is Not Null
    Drop Trigger Dbo.TRG_Frequency_Data_After
Go
Create Trigger TRG_Frequency_Data_After On Dbo.Frequency_Data
After Insert, Update, Delete
-- =============================================
-- Description:
-- Modification:
-- Date         By          Description
-- 08-Oct-2013  C Konior    Created
-- =============================================
As
Begin
    -- This is so as not to return extra row counts
    Set Nocount On
    Declare @FrequencyCodeMonday Int = 1

    -- Insert or update
    If Update(Frequency_Code)
    Begin
        If Exists (Select Top 1 0
            From Inserted
            Where Frequency_Code <> @FrequencyCodeMonday
            And Coalesce(Frequency_Amt, 0) = 0)
        Begin
            Raiserror('Frequency_Data cannot have a Null or 0 Frequency_Amt when Frequency_Code <> Monday', 16, 1)
            Rollback Transaction
            Return
        End
    End

    -- If the Deleted table is empty, it was an insert
    -- If the Inserted table is empty, it was a delete
    If Not Exists (Select Top 1 0 From Deleted)
    Or Not Exists (Select Top 1 0 From Inserted)
    Begin
        Update Frequency_Info
        Set Next_Frequency_Data_Key = Coalesce(Max_Frequency_Data_Key, 0) + 1
        From Frequency_Info
        Outer Apply (
            Select Max(Frequency_Data_Key) As Max_Frequency_Data_Key
            From Frequency_Data) Frequency_Data
        Where Next_Frequency_Data_Key <> Coalesce(Max_Frequency_Data_Key, 0) + 1
    End
End
Go
Testing a delete updates the Next_Frequency_Data_Key column:
Delete From Frequency_Data
Where Frequency_Data_Key = 4
Go
Select * From Frequency_View
Go
Of course, any seasoned textbook professional will look at this and realize that the entire Update(Frequency_Code) validation block can be replaced with a single check constraint:
Alter Table Frequency_Data
Add Constraint CHK_Frequency_Data
Check (Frequency_Code = 1 Or Coalesce(Frequency_Amt, 0) <> 0)
Go
And this is true, but they are not one-for-one equivalents; there are subtle differences:
Declare @FrequencyCodeMinutes Int = 2

Insert Frequency_Data (Frequency_Code, Frequency_Amt)
Select @FrequencyCodeMinutes, Null

Print 'Batch Still Going'
Go
Msg 547, Level 16, State 0, Line 3
The INSERT statement conflicted with the CHECK constraint "CHK_Frequency_Data". The conflict occurred in database "master", table "dbo.Frequency_Data".
The statement has been terminated.
Batch Still Going
As you can see, the error message was less descriptive than our own, containing no information about which part of the statement had caused validation to fail. This isn't very useful to the forgetful DBAs running the commands manually, who now have to dig into the table definition to work out what happened.
Also, the batch continued to run, whereas a trigger that issues Rollback Transaction prevents any further execution (in most circumstances; this can sometimes be overridden by the calling application).
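As a sketch of that override, a caller can wrap the statement in Try/Catch; the trigger's error is then caught and the batch carries on:

```sql
-- Assumed sketch: catching the trigger's error so the batch survives
Begin Try
    Insert Frequency_Data (Frequency_Code, Frequency_Amt)
    Select 2, 0  -- violates the rule, so the trigger raises and rolls back
End Try
Begin Catch
    Print Error_Message()
End Catch
Print 'Batch Still Going'
Go
```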
You might wonder what the order of execution is if we have both a trigger and check constraint defined. It goes like this:
- Instead Of trigger (SQL Server's equivalent of a before trigger)
- Check Constraint
- After Trigger
Though you can also have multiple AFTER triggers, and their relative order can be adjusted by designating first and last triggers.
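First and last triggers are designated with sp_settriggerorder; a quick sketch using our trigger name:

```sql
-- Make our trigger the first AFTER trigger to fire for inserts
Exec sp_settriggerorder
    @triggername = 'TRG_Frequency_Data_After',
    @order = 'First',
    @stmttype = 'INSERT'
Go
```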
A few more notes. Do you want to disable or enable the trigger later?
Alter Table Frequency_Data Enable Trigger TRG_Frequency_Data_After
Go
Alter Table Frequency_Data Disable Trigger TRG_Frequency_Data_After
Go
What about the check constraints?
Alter Table Frequency_Data NoCheck Constraint CHK_Frequency_Data
Go

-- This can be bad
Alter Table Frequency_Data Check Constraint CHK_Frequency_Data
Go
If Exists (Select Top 1 0
    From Sys.Check_Constraints
    Where Is_Not_Trusted = 1
    And Name = 'CHK_Frequency_Data')
    Print 'Not Trusted'
Else
    Print 'Trusted'
Go

-- This is better but takes longer because it validates the existing data
Alter Table Frequency_Data With Check Check Constraint CHK_Frequency_Data
Go
If Exists (Select Top 1 0
    From Sys.Check_Constraints
    Where Is_Not_Trusted = 1
    And Name = 'CHK_Frequency_Data')
    Print 'Not Trusted'
Else
    Print 'Trusted'
Go
Trusted constraints are important because the optimizer will not use an untrusted constraint when generating query plans.
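A quick way to scan a database for constraints in this state (both check constraints and foreign keys carry the flag):

```sql
-- List every untrusted check constraint and foreign key in the database
Select Name, 'Check Constraint' As Constraint_Type
From Sys.Check_Constraints
Where Is_Not_Trusted = 1
Union All
Select Name, 'Foreign Key'
From Sys.Foreign_Keys
Where Is_Not_Trusted = 1
Go
```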
One More Thing
DBAs should not create or enable triggers or check constraints on tables that are being used by client software until they have either investigated the source code or (better) tested that the client still works with the change, rather than checking only from the database side.
One great example: .NET applications that use a SqlBulkCopy object will suddenly throw a new exception on encountering a table that now has an unexpected trigger or check constraint:
DatabaseErrorMessage: Bulk copy failed. User does not have ALTER TABLE permission on table 'FREQUENCY_DATA'. ALTER TABLE permission is required on the target table of a bulk copy operation if the table has triggers or check constraints, but 'FIRE_TRIGGERS' or 'CHECK_CONSTRAINTS' bulk hints are not specified as options to the bulk copy command.
That will require either additional permissions being granted (which probably isn't a great idea) or the application being modified to include the relevant hints before you add the trigger or check constraint. This is one case where it's better to investigate and test first rather than run in blindly.
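The same behavior exists in plain T-SQL: BULK INSERT also skips triggers and check constraints unless told otherwise. Here is a sketch (the file path is made up) of the hints the error message refers to:

```sql
-- Hypothetical file path; FIRE_TRIGGERS and CHECK_CONSTRAINTS make the
-- bulk load respect our trigger and constraint instead of skipping them
Bulk Insert Frequency_Data
From 'C:\Temp\Frequency_Data.dat'
With (Fire_Triggers, Check_Constraints)
Go
```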