30 Tips for Database Optimization
1. Optimize queries to avoid full table scans. Prioritize creating indexes on columns involved in WHERE and ORDER BY clauses.
2. Avoid checking for NULL values in the WHERE clause, as this will cause the engine to bypass indexes and perform a full table scan. For example:
select id from t where num is null
Avoid using NULL in the database whenever possible. Use NOT NULL to fill columns. Fields like remarks, descriptions, or comments can be set to NULL, but for others, it’s better to avoid NULL.
Note: NULL may still occupy space. For instance, a CHAR(100) column reserves 100 characters whether or not it contains a value (including NULL). On the other hand, variable-length fields like VARCHAR do not allocate space for NULL.
Instead, you can set a default value of 0 for the num column and query as follows:
select id from t where num = 0
3. Avoid using != or <> operators in the WHERE clause, as these also bypass indexes and cause full table scans.
4. Avoid using OR in the WHERE clause when one field is indexed and another is not. This will bypass the index, causing a full table scan. For example:
select id from t where num=10 or Name = 'admin'
Instead, use:
select id from t where num = 10 union allselect id from t where Name = 'admin'
5. Use IN and NOT IN cautiously to avoid full table scans. For instance:
select id from t where num in(1,2,3)
For consecutive values, use BETWEEN instead:
select id from t where num between 1 and 3
Often, EXISTS can be a better alternative to IN:
select num from a where num in(select num from b)
Replace with:
select num from a where exists(select 1 from b where num=a.num)
6. Queries like the following will also trigger full table scans:
select id from t where name like ‘%abc%
Consider using full-text search for better performance.
7. Using parameters in the WHERE clause can cause full table scans because SQL parses local variables only at runtime. The optimizer cannot delay the access plan's selection, as it happens during compilation. For example:
select id from t where num = @num
Force the query to use an index instead:
select id from t with(index(index_name)) where num = @num
Avoid using expressions in the WHERE clause, which bypass indexes:
select id from t where num/2 = 100
Rewrite as:
select id from t wherenum = 100*2
Avoid using functions in the WHERE clause, which also bypass indexes:
select id from t where substring(name,1,3) = ’abc’ select id from t where datediff(day,createdate,’2005-11-30′) = 0 -–‘2005-11-30’
Rewrite as:
select id from t where name like 'abc%' select id from t where createdate >= '2005-11-30' and createdate < '2005-12-1'
10. Avoid placing functions, arithmetic operations, or other expressions on the left side of the = operator in the WHERE clause.
11. When using a composite index, ensure the query includes the first column of the index to make use of it. Try to keep field order consistent with the index.
12. Avoid writing meaningless queries, such as:
select col1,col2 into #t from t where 1=0
Use:
create table #t(…)
13. When updating a single or few columns, avoid updating all fields, as this increases performance overhead and generates more logs.
14. For JOINs involving large datasets, filter using pagination before performing the JOIN to reduce logical reads and improve performance.
15. Avoid unconditional counts like:
select count(*) from table
This triggers a full table scan and has no business significance.
16. Avoid excessive indexes. While they improve SELECT efficiency, they decrease INSERT and UPDATE efficiency because of index rebuilding. Limit a table to no more than six indexes.
17. Minimize updates to clustered index columns. Changes to these columns require reordering the entire table, consuming significant resources.
18. Use numeric fields wherever possible instead of character types to improve query performance and reduce storage overhead.
19. Prefer VARCHAR/NVARCHAR over CHAR/NCHAR for variable-length data, as it saves storage space and improves query performance.
20. Avoid SELECT * and specify column names explicitly.
21. Use table variables instead of temporary tables when possible. However, for large datasets, remember table variables have limited indexing.
22. Reduce the frequent creation and deletion of temporary tables to conserve system resources. Use them appropriately for scenarios like repeated references to large or commonly accessed datasets.
23. For large data inserts into temporary tables, use SELECT INTO instead of CREATE TABLE to reduce log usage and improve speed.
24. Explicitly drop temporary tables at the end of stored procedures to avoid prolonged system table locks.
25. Avoid cursors due to their poor performance. If a cursor processes over 10,000 rows, consider rewriting it.
26. Seek set-based solutions instead of cursor or temporary table methods for better efficiency.
27. Cursors can be used for small datasets, especially with FAST_FORWARD. However, set-based methods are generally faster.
28. Use SET NOCOUNT ON at the beginning and SET NOCOUNT OFF at the end of stored procedures and triggers to prevent unnecessary messages.
29. Minimize large transactions to enhance system concurrency.
30. Avoid returning large datasets to clients. Evaluate if such requests are reasonable.