Sunday, March 11, 2012

Create only unique indexes

Hello.
SQL Server 2000, Windows 2000 Server, latest SPs
I have been told by a knowledgeable friend that SQL Server works best if
you ONLY use unique indexes. To create an index on a non-indexed field,
he recommends creating a composite index on the indexed field together
with the table's primary key, eg:
CREATE UNIQUE INDEX idx1 ON t.field1, t.id
Could anyone please explain:
i) if this is indeed best practice
ii) if so, why this works, and
iii) if so, why SQL Server does not do this by default?
Thanks in advance for your help,
MikeMike,
The more selective a column's values are the more likely the optimizer is to
use the index on the column. That being said a column with only unique
values will have the highest selectivity and will be a prime candidate for
an index. However in the real world data is very often required to be
searched that is not unique. Even then if the column selectivity is high
enough, the optimizer is still likely to use the index if it exists.
By default a PRIMARY KEY is a clustered index. When you create a
nonclustered index the pointer to the data in the index is the clustered
key. So there is no need to add the index column and the PK column in the
index.
HTH
Jerry
"Mike Chamberlain" <none@.hotmail.com> wrote in message
news:ehuM8850FHA.3524@.tk2msftngp13.phx.gbl...
> Hello.
> SQL Server 2000, Windows 2000 Server, latest SPs
> I have been told by a knowledgeable friend that SQL Server works best if
> you ONLY use unique indexes. To create an index on a non-indexed field,
> he recommends creating a composite index on the indexed field together
> with the table's primary key, eg:
> CREATE UNIQUE INDEX idx1 ON t.field1, t.id
> Could anyone please explain:
> i) if this is indeed best practice
> ii) if so, why this works, and
> iii) if so, why SQL Server does not do this by default?
> Thanks in advance for your help,
> Mike|||Mike Chamberlain wrote:
> CREATE UNIQUE INDEX idx1 ON t.field1, t.id
> Could anyone please explain:
> i) if this is indeed best practice?
No. Here's few tidbits. Others may add more:
1- Indexes should be driven by DRI constraints (PK, FK, alternate keys)
and by performance needs. If you're dealing with a PK or alternate key,
it's going to be unique as a matter of design. If it's a FK, it's most
likely a non-unique index. If a column or set of columns should be
unique because the business says they are, then a unique index should be
used. If the business says they're not unique, then they're not.
2- A table with a clustered index uses the clustered index key as the
pointer in all non-clustered indexes. Using your co-workers recommended
scenario in this case you would have a unique, clustered index on ID,
and a unique, non-clustered index on COL1 + ID. Underneath, the index
will really have ID + COL1 + ID, which just adds overhead.
3- A clustered index key is always stored internally as a unique value.
If you create a non-unique, clustered index, each column value that has
repeating values has a unique identifier tacked onto the value of each
to make it unique internally. Adding on an UNIQUE INT value as
suggested, could decrease index size if there are many repeating values.
In this case, it could help a little in size, but I don't think it's
worth it.
4- On a table with many non-unique indexes, you would have the ID column
added onto each, adding an additional 4 bytes per index key. Using a
clustered index on the ID would give you the same effect without the
additional overhead.
5- SQL Server can use non-unique indexes as well as unique ones. If your
index statistics are kept up to date, then the query optimizer should
select the proper index.
David Gugick
Quest Software
www.imceda.com
www.quest.com|||Just a minor point, if you don't mind, David (and some additional points for
Mike):

> 2- A table with a clustered index uses the clustered index key as the poin
ter in all non-clustered
> indexes. Using your co-workers recommended scenario in this case you would
have a unique,
> clustered index on ID, and a unique, non-clustered index on COL1 + ID. Und
erneath, the index will
> really have ID + COL1 + ID, which just adds overhead.
SQL Server will not store the clustering key twice if you name the clusterin
g key in the nc index.
In other words, say you have a unique clustered index on the "ID" column and
want to create a nc
index on the "Data" column, then both below will do the exact same thing:
(data, id)
(data)
For both above, the index will be on (data, id)
Below is a different thing, though:
(id, data)
But above would not be what is desired, as the reason to create the nc index
is probably for SQL
Server to be able to SEEK through the index on the "data" column alone. Not
having that column as
the first column in the index makes it useless for SEEK over the column.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"David Gugick" <david.gugick-nospam@.quest.com> wrote in message
news:%23D$QNO60FHA.3720@.TK2MSFTNGP14.phx.gbl...
> Mike Chamberlain wrote:
> No. Here's few tidbits. Others may add more:
> 1- Indexes should be driven by DRI constraints (PK, FK, alternate keys) an
d by performance needs.
> If you're dealing with a PK or alternate key, it's going to be unique as a
matter of design. If
> it's a FK, it's most likely a non-unique index. If a column or set of colu
mns should be unique
> because the business says they are, then a unique index should be used. If
the business says
> they're not unique, then they're not.
> 2- A table with a clustered index uses the clustered index key as the poin
ter in all non-clustered
> indexes. Using your co-workers recommended scenario in this case you would
have a unique,
> clustered index on ID, and a unique, non-clustered index on COL1 + ID. Und
erneath, the index will
> really have ID + COL1 + ID, which just adds overhead.
> 3- A clustered index key is always stored internally as a unique value. If
you create a
> non-unique, clustered index, each column value that has repeating values h
as a unique identifier
> tacked onto the value of each to make it unique internally. Adding on an U
NIQUE INT value as
> suggested, could decrease index size if there are many repeating values. I
n this case, it could
> help a little in size, but I don't think it's worth it.
> 4- On a table with many non-unique indexes, you would have the ID column a
dded onto each, adding
> an additional 4 bytes per index key. Using a clustered index on the ID wou
ld give you the same
> effect without the additional overhead.
> 5- SQL Server can use non-unique indexes as well as unique ones. If your i
ndex statistics are kept
> up to date, then the query optimizer should select the proper index.
>
> --
> David Gugick
> Quest Software
> www.imceda.com
> www.quest.com|||Tibor Karaszi wrote:
> Just a minor point, if you don't mind, David (and some additional
> points for Mike):
> SQL Server will not store the clustering key twice if you name the
> clustering key in the nc index. In other words, say you have a unique
> clustered index on the "ID" column and want to create a nc index on
> the "Data" column, then both below will do the exact same thing:
> (data, id)
> (data)
>
Tibor,
I just did a quick test. I created a clustered index on COL1 + COL2. I
then created two other indexes: A unique, non-clustered on COL3 + COL1 +
COL2 and a non-clustered on COL3.
In sysindexes, the keycnt column shows as 5 for the first unique,
non-clustered index and 3 for the non-unique index. The keys column is
larger for the 5 keycnt column. However, the reserved pages are similar,
supporting your theory.
David Gugick
Quest Software
www.imceda.com
www.quest.com|||David,
DBCC PAGE to the rescue :-). I created a table same as your description and
looked at the actual
index pages:
A clustered index on COL1 + COL2. A unique, non-clustered on COL3 + COL1 + C
OL2 and a non-clustered
on COL3. My test show that a row entry for the index page for the two non-cl
ustered indexes is
identical:
use tempdb
GO
drop table t
GO
create table t(c1 varchar(1), c2 varchar(1), c3 varchar(1), c4 varchar(1), c
5 varchar(1))
insert into t values ('a', 'b', 'c', 'd', 'e')
create clustered index x1 on t(c1, c2)
create unique nonclustered index x2 on t(c3, c1, c2)
create nonclustered index x3 on t(c3)
SELECT keycnt, * FROM sysindexes where id = object_id('t')
DBCC IND(tempdb, t, -1)
--Take address for indid 2 and 3, page type 2 and use below
DBCC TRACEON(3604)
DBCC PAGE(tempdb, 1, 15, 1)
DBCC PAGE(tempdb, 1, 31, 1)
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"David Gugick" <david.gugick-nospam@.quest.com> wrote in message
news:%23HdkwXB1FHA.1256@.TK2MSFTNGP09.phx.gbl...
> Tibor Karaszi wrote:
> Tibor,
> I just did a quick test. I created a clustered index on COL1 + COL2. I the
n created two other
> indexes: A unique, non-clustered on COL3 + COL1 + COL2 and a non-clustered
on COL3.
> In sysindexes, the keycnt column shows as 5 for the first unique, non-clus
tered index and 3 for
> the non-unique index. The keys column is larger for the 5 keycnt column. H
owever, the reserved
> pages are similar, supporting your theory.
>
> --
> David Gugick
> Quest Software
> www.imceda.com
> www.quest.com

No comments:

Post a Comment