I’m often asked about whether or not we have “duplicate” statistics in our environment. The thought is that, perhaps, duplicate statistics would cause a delay in execution plan compilation or even a different plan. I’m not sure that it’d be a big deal to have them (please correct me if I’m wrong), but I still decided to work up a little script that will help identify them, at least.
with x as ( select stats_generation_number = row_number() over (partition by base.object_id, base.stat_cols, base.filter_definition order by base.is_index desc), * from ( select o.object_id, s.stats_id, schema_name = sh.name, object_name = o.name, stat_name = s.name, is_index = case when i.name is null then 0 else 1 end, s.has_filter, s.filter_definition, stat_cols = stuff( ( select ',' + index_col(db_name(db_id()) + '.' + quotename(sh.name) + '.' + quotename(o.name),sc.stats_id,sc.stats_column_id) from sys.stats_columns sc where s.object_id = sc.object_id and s.stats_id = sc.stats_id for xml path('') ),1,1,'' --eliminates NULL column names which (in my environment) are reducible to partition key injection ) from sys.stats s join sys.objects o on s.object_id = o.object_id join sys.schemas sh on o.schema_id = sh.schema_id left join sys.indexes i on s.object_id = i.object_id and s.name = i.name where sh.name != 'sys' ) base ) select * from x where x.stats_generation_number > 1 and x.is_index = 0 order by object_id