database - SQL Server 2005 Performance: Distinct or full table in WHERE IN statement
We have 2 tables:
- document: id, title, document_type_id, showon_id
- documenttype: id, name
- Relationship: documenttype has many documents (document.document_type_id = documenttype.id).
We wish to retrieve the list of document types for one given showon_id.
We see 2 possibilities:
select documenttype.*
from documenttype
where documenttype.id in (
    select distinct document.document_type_id
    from document
    where showon_id = 42
);

select documenttype.*
from documenttype
where documenttype.id in (
    select document.document_type_id
    from document
    where showon_id = 42
);
Our question is: when, if ever, is it better to use distinct to get a smaller record set, versus retrieving the whole table and letting the IN statement walk the table until the first match? (We guess that's what it does ;-))
Is this different for different databases, or is there a common answer?
Or is there a better way of doing it? (We are in .NET land.)
From my point of view it should not make a difference inside SQL Server (but who knows how it is implemented).
Think of it this way: to return the result set, the server needs to go to the document table and retrieve the document_type_id values where showon_id = 42. In the process of retrieving the document_type_ids (e.g. by index seeking) it puts them into a hash table. When that process has finished, the hash table will contain only distinct values anyway. After that, query execution goes to the documenttype table, scans the primary key, and probes the hash table. Note that it depends: e.g. it may be more efficient not to use a hash table when the expected row count from the document table is low compared to documenttype. But in general it will be the same query plan as for the query wmasm suggested.
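The build-and-probe idea described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how SQL Server is actually implemented: the sample rows are made up, and the helper names build_hash_table and semi_join are hypothetical. It shows why an explicit distinct step should not change the result once the ids land in a hash table.

```python
# Illustrative model only: this is NOT SQL Server's actual implementation.
# The table contents below are made-up sample data.

documents = [
    # (id, title, document_type_id, showon_id)
    (1, "Invoice A", 10, 42),
    (2, "Invoice B", 10, 42),
    (3, "Contract A", 20, 42),
    (4, "Memo A", 30, 7),
]

document_types = [
    # (id, name)
    (10, "Invoice"),
    (20, "Contract"),
    (30, "Memo"),
]


def build_hash_table(rows, showon_id, distinct):
    """Build phase: collect document_type_id values for one showon_id."""
    type_ids = [row[2] for row in rows if row[3] == showon_id]
    if distinct:
        # Explicit DISTINCT step: order-preserving de-duplication.
        type_ids = list(dict.fromkeys(type_ids))
    # Inserting into a set discards duplicates either way.
    return set(type_ids)


def semi_join(types, hash_table):
    """Probe phase: scan documenttype, keep rows whose id is in the hash table."""
    return [t for t in types if t[0] in hash_table]


with_distinct = semi_join(
    document_types, build_hash_table(documents, 42, distinct=True)
)
without_distinct = semi_join(
    document_types, build_hash_table(documents, 42, distinct=False)
)

print(with_distinct)
print(without_distinct)
```

Both calls return the same document types, because the hash table ends up holding each document_type_id once whether or not duplicates were removed beforehand. To see what SQL Server really does, compare the actual execution plans of the two queries.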