register log on
Knowledge base
 

Best practices for SQLServer database design

KB Artcile # 2627
This article summarizes the best practices accumulated by our development lab team while working on numerous database-centric projects. 
1. DO use a primary key
Each table must have a primary key. Even it is a simple lookup table, create a dedicated column which carries the meaning of the primary key. The primary key should have no business meaning. Use IDENTITY constraint against the primary key.
2. DO NOT use composite primary key
Composite primary keys usually have business meaning and thus should be avoided.  Also queries are simpler when using just one column as a primary key.
3. DO use custom time stamps
Create dedicated columns CreationDateStamp and UpdateDateStamp to maintain time stamps. Conventional datetime data type for these columns is easier to use. Use triggers to maintain these time stamps, for example:

CREATE TRIGGER tu_Customer ON Customer FOR UPDATE AS
BEGIN
UPDATE Customer
SET
    UpdateDateStamp = GETDATE()
FROM
    Customer
JOIN inserted ON inserted.CustomerID = Customer.CustomerID
RETURN
END

CREATE TRIGGER ti_Customer ON Customer FOR INSERT AS
BEGIN
UPDATE Customer
SET
    CreationDateStamp = GETDATE(),
    UpdateDateStamp   = GETDATE()
FROM
    Customer
JOIN inserted ON inserted.CustomerID = Customer.CustomerID
RETURN
END

4. DO use clustered index
Clustered index is especially useful in queries with a range condition, such as a date range - use clustered index against datetime column in this case.

Note that update performance against tables with clustered index is better, when this index is against a column with values incrementing on each INSERT, such as primary key with IDENTITY constraint.

5. DO NOT use triggers to implement complex business logic
Triggers are not on the database design "surface". Use stored procedures to implement business logic and call stored procedures from triggers if necessary.

6. DO NOT overuse transactions
Use transactions only if absolutely necessary. Although transactions help to maintain data integrity, they also significantly extend lock retention and could cause significant database slow down when utilization grows.

7. DO use normalized design
Normalized database design should be a default consideration unless scalability considerations require the opposite.  

8. DO use de-normalized design when necessary
De-normalization is not an evil. Scalability considerations are a frequent reason for use of de-normalization. By dropping declarative referential integrity constraints and splitting databases into smaller ones possibly running on different servers, we may achieve a scalable distributed design which is capable of meeting growing business requirements.

9. DO use a reduced set of data types
Although SQLServer offers a rich set of data types, which is growing with each version, we recommend staying with a reduced set of data types for broader compatibility with programming tools, middleware components and other databases, either from previous generations or even other vendors.

We recommend the following data types: varchar, int, numeric, datetime, bit, text, image.

10. XML columns
Starting with SQLServer 2005, Microsoft introduced a column with the type XML. Although there are benefits to storing XML-structured data into one column, effectively dynamically expanding table structure, there are performance and use complexity penalties when using data encapsulated into XML, especially with tables with large numbers of rows. We recommend that you exercise caution when using this feature.

11. DO break tables into multiple tables
When part of a table is mostly static and is used in frequent SELECT queries, while another part of the table has columns with high frequency of UPDATE queries, we recommend that, for scalability reasons, such tables are split into two or more tables with one-to-one relations.

12. DO consider returning XML from stored procedures
Returning XML from stored procedure is an efficient way to meet change requests while preserving stored procedure API. For example, when additional data is needed from the stored procedure, this would be matter of adding additional elements in the returned XML while preserving the structure and data types of stored procedure output parameters. That said, avoid using XML with complex structures and large data sets.

13. DO consider XML as input to stored procedures
Sending XML to stored procedure as input parameter is an efficient way to pass data sets (lists of values) in a most general way.  SQLServer is capable of easily parsing and converting XML with simple hierarchical structure into temporary tables and then using them in queries. Avoid using XML with complex structure and large data sets.

The following example demonstrates how to parse XML within stored procedure.  In this example, we assume that XML is passed via input parameter @mXMLText (declared with data type "text").  XML is converted to a temporary table #Temp.

DECLARE @hDoc int

CREATE TABLE #Temp (
    ProductName    varchar(50)    null)

EXEC sp_xml_preparedocument @hDoc OUTPUT, @mXMLText
IF @@ERROR <> 0 RETURN

INSERT INTO #Temp (ProductName)
SELECT *
FROM OPENXML(@hDoc, N'/root/row', 2)
WITH (ProductName varchar(50))
IF @@ERROR <> 0 RETURN

EXEC sp_xml_removedocument @hDoc

XML example which can be parsed by this code is:

<root>
  <row>
    <ProductName>Car</ProductName>
  </row>
  <row>
    <ProductName>Boat</ProductName>
  </row>
  <row>
    <ProductName>Bicycle</ProductName>
  </row>
</root> 
 
14. DO schedule sp_updatestats
Schedule execution of system stored procedure sp_updatestats to run at least once a day. This will help SQL query optimizer take into account changes in data allocation and indexes, choose better execution plan and ultimately execute queries faster. 

15. DO NOT use "triggers which fire other triggers" feature
SQLServer offers what at first seems like a convenient feature by enabling "triggers which fire other triggers". However, in practice this feature may cause data updates against multiple tables, which will be hard to conceptualize and take into account during a continuous design process. Instead, if cascading data modifications are required by business logic, create a dedicated stored procedure where all data manipulations are explicitly expressed by SQL code with proper error handling.

16. DO add feature optimization code at the beginning of each stored procedure
SQLServer provides commands which turn on/off certain SQL engine features. To enhance performance, use the following  command at the beginning of each store procedure:
SET NOCOUNT ON
This prevents extra round trips of information flow between the calling program and the server, thus reducing network traffic per stored procedure call.  Do not use this technique f the calling program is actually relying on data provided by the SET NOCOUNT OFF.

17. DO NOT use SQL cursors
SQL cursors require significant resources to maintain them. Therefore, use alternative techniques, for example, loops against temporary tables in stored procedures. A loop may be implemented by using a column with a unique numeric value in a temporary table (or create a new column in a temporary table with IDENTITY constraint), which is used in the loop as a criterion for selecting the next record on each iteration. Below is a code sample, which uses this technique:

-- select the first customer identifier
SELECT
@nCurrentCustomerID = MAX(CustomerID)
FROM
#Temp

-- start the loop
WHILE NOT ISNULL(@nCurrentCustomerID, 0) = 0
BEGIN
-- do something with the row which has customer identifier the same as @nCurrentCustomerID
--
--
-- select the next customer identifier
SELECT
@nCurrentCustomerID = MAX(CustomerID)
FROM
#Temp
WHERE
CustomerID > @nCurrentCustomerID

END
-- end of the loop

18. DO use optimizer hint WITH (NOLOCK)
Use WITH (NOLOCK) optimizer hint in SELECT statements (including in JOIN clauses) to minimize data locks, e.g.:

SELECT
Customer.CustomerID, Customer.CustomerName, CustomerOrder.OrderID
FROM
Customer WITH (NOLOCK)
JOIN
CustomerOrder (WITH NOLOCK) ON CustomerOrder.CustomerID = Customer.CustomerID
Do not use this technique, if read of uncommitted data is undesirable.

19. DO include columns with SELECT statement
Use only columns you need in SQL SELECT clauses, do not take shortcuts by using an asterisk (*) to return all columns. This will both reduce the network traffic and help SQL query optimizer use a better execution plan.

20. DO use table views
Use table views, do not execute SQL queries from program code to run against physical tables.

21. DO simplify (but not sacrifice) SQLServer access control setup for web applications
In the days of client-server development, end users connected to databases directly. DBA's used to set up dedicated logins for each user, map them to corresponding database users and create an access control map. In web applications, SQLServer is no longer playing the role of end-user authentication authority, this role is offloaded to the user facing web identity management  and single sign-on systems.

Therefore, consider minimizing the number of SQLServer logins, as the same login may be used for thousands and millions of web users. It still makes sense to create separate logins for different web applications.

Also, consider simplifying access control maps - the entire application-specific database may be accessible to logins used by web applications, as web applications are in charge of what a user can and cannot do. An exception to this rule might be the case of a large database with hundreds or and thousands of database objects, which is interfaced with multiple applications developed by different teams - in this case, use of tightly managed access control maps should be maintained.