Thursday, March 22, 2012

Create Statistics on NonIndexed Column

Hello. Could someone explain to me why it might be desirable to create a
statistic on a non-indexed column?
Thank you."Amos Soma" <amos_j_soma@.yahoo.com> wrote in message
news:%23X4AxZWBGHA.3604@.TK2MSFTNGP09.phx.gbl...
> Hello. Could someone explain to me why it might be desirable to create a
> statistic on a non-indexed column?
> Thank you.
>
The more information the query optimizer has, the better decisions it can
make.
Having statistics on a column that is not part of an index may help it make
better decisions about join strategies, table-scans versus index
seeks/scans, bookmark lookups and so forth.
Rick Sawtell
MCT, MCSD, MCDBA|||Here are some examples to illustrate Rick's statement. Suppose you have
the following table:
CREATE TABLE Persons
(PersonID int not null PRIMARY KEY CLUSTERED
,LastName varchar(100) not null
,FirstName varchar(100) not null
,Address varchar(100) null
-- many other columns, very wide table
,CONSTRAINT UQ_Persons_Name UNIQUE (LastName, FirstName)
)
Now, the DDL above will create a nonclustered index on (LastName,
FirstName). Now suppose you have the following query:
SELECT *
FROM Persons
WHERE FirstName='Amos'
With the default table settings, SQL Server will auto create statistics
of column FirstName.
With these statistics, the optimizer can estimate how many rows there
are with FirstName 'Amos'. If this is a small percentage, then the
strategy would be to scan the nonclustered index and retrieve the
matching rows from the table (bookmark lookups). If it is a large
percentage, the clustered index will be scanned (basically a table
scan).
Without these statistics, SQL Server will optimizer for the worst case
situation, which in this case means scanning the table (clustered index
scan).
Another example. Suppose you also have this table:
-- just an example. Relation might not make sense
CREATE TABLE Hobbies
(HobbyID int not null PRIMARY KEY CLUSTERED
,Hobby varchar(100) not null UNIQUE
,Comments varchar(100) null
,PersonID int not null REFERENCES Persons
)
-- index on the foreign key
CREATE INDEX IX_Hobbies_PersonsID ON Hobbies(PersonID)
with this query:
SELECT Hobby
FROM Hobbies
INNER JOIN Persons
ON Persons.PersonID = Hobbies.PersonID
WHERE Address = 'Main Street'
AND Comments = 'Todo'
In this case statistics of the columns Persons.Address and
Hobbies.Comments can determine the access path for the query. If column
Comments has a high occurrence of 'Todo' and Address a low occurrence of
'Main Street', then the optimizer might choose to access (scan) the
Persons table first, and then lookup matching rows in Hobbies (and
filter out remaining rows). With different statistics the optimizer
might access (scan) the Hobbies table and then lookup the matching rows
in Persons (and filter out the remaining unwanted rows).
HTH,
Gert-Jan
Amos Soma wrote:
> Hello. Could someone explain to me why it might be desirable to create a
> statistic on a non-indexed column?
> Thank you.

No comments:

Post a Comment