Skewed Data: Analyzing the Windows

Previously, we looked at analyzing a table to see which columns in that table may contain skewed data.

That was a good start, but now it’s time to look at the statistics that exist on that column to see if we can identify potential candidates for filtered statistics, based on the “windows” between histogram steps.

Much of the logic is the same in this script, except that it counts every value in the column. Additionally, it will look at all statistics that exist on that column; if there are no statistics on that column, we can’t do a histogram step window analysis anyway. The same general principles apply to this analysis as well: we’re looking at the test statistic (zG1) to determine how skewed the data may actually be.
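For context, zG1 is a standardized skewness test statistic. A minimal sketch of the underlying sample skewness (g1), computed over the per-value row counts in a hypothetical count table, might look like this (the actual script goes further and standardizes it into zG1):

-- Minimal sketch only, not the analysis script: plain sample skewness (g1) of the
-- per-value row counts. tempdb.dbo.tally is a hypothetical count table here, with
-- one row per distinct value and f = the number of rows carrying that value.
select
    g1 = (sum(power(1. * f - avg_f, 3)) / count_big(*))
         / power(sum(power(1. * f - avg_f, 2)) / count_big(*), 1.5)
from (
    select f, avg_f = avg(1. * f) over ()
    from tempdb.dbo.tally
) t;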

Just like with the table analysis, I worked around data type issues by using dense_rank() over the keys. In the histogram dump table, I created a column called alt_key which I then update based on matching the count table key to the histogram step key.
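The count table itself isn’t shown here, but a minimal sketch of how it might be built follows (the @ColumnName variable and the [key]/f/x column names are assumptions, inferred from the update statement below):

-- Sketch of the per-column count table referenced below as @TallyTable; the variable
-- @ColumnName and the column names [key], f, and x are inferred, not the original script.
set @sql = N'select
        [key] = ' + quotename(@ColumnName) + N',
        f     = count_big(*),                                                -- rows per distinct value
        x     = dense_rank() over (order by ' + quotename(@ColumnName) + N') -- data-type-agnostic ordering
    into ' + @TallyTable + N'
    from ' + quotename(@SchemaName) + N'.' + quotename(@TableName) + N'
    group by ' + quotename(@ColumnName) + N';';
exec sp_executesql @sql;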

-- Dump the histogram for this statistic into the staging table.
set @sql = 'dbcc show_statistics(''' + quotename(@SchemaName) + '.' + quotename(@TableName) + ''',''' + @StatName + ''') with histogram, no_infomsgs;';
print @sql;

insert into tempdb.dbo.histo (
    range_hi_key,
    range_rows,
    eq_rows,
    distinct_range_rows,
    avg_range_rows)
execute sp_executesql @sql;

-- Stamp each histogram step with the actual row count and the dense_rank (alt_key)
-- from the count table, matched on the step's range_hi_key.
set @sql =
    'update h
        set h.actual_eq_rows = c.f,
            h.alt_key = c.x
     from tempdb.dbo.histo h
     join ' + @TallyTable + ' c
        on h.range_hi_key = c.[key];';

exec sp_executesql @sql;

What this lets me do is pull the analysis results later, without having to muck with sorting on different data types:

select an.*,
    confirm_query = 'select * from ' + @TallyTable + ' where ' +
        case when cols.last_alt_key is not null
             then 'x >= ' + cast(cols.last_alt_key as nvarchar(255)) + ' and '
             else ''
        end + 'x <= ' + cast(cols.alt_key as nvarchar(255)) + ' order by [key]'
from tempdb.dbo.histo an
join (
        select
            stat_name,
            last_range_hi_key = lag(range_hi_key, 1, null) over (partition by stat_name order by range_hi_key),
            range_hi_key,
            last_alt_key = lag(alt_key, 1, null) over (partition by stat_name order by range_hi_key),
            alt_key
        from tempdb.dbo.histo
    ) cols
    on an.stat_name = cols.stat_name
    and an.range_hi_key = cols.range_hi_key
where an.actual_distinct_range_rows >= 100
order by abs(zg1) desc;

With the above analysis available to us, we’d run the text in confirm_query to examine the window (each histogram step is included), sanity-check the analysis, and ensure that the window really does present with skew and has enough distinct values to make a filtered statistic worthwhile. Please note that all of the normal considerations with regard to filtered statistics apply: they don’t update when their filtered threshold is met, but only when the threshold for the entire table is met, and only then if the filtered statistic is loaded for plan (re)compilation; they may add more time to maintenance tasks; they may never be used; and so on.
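If the window checks out, the filtered statistic itself is simple to create. A purely hypothetical example (the table, column, and boundary values are made up; in practice they would come from the confirm_query output for the window you decided to cover):

-- Hypothetical example only: table, column, and boundaries would come from the analysis above.
create statistics st_SalesOrderDetail_ProductID_707_897
on Sales.SalesOrderDetail (ProductID)
where ProductID >= 707 and ProductID <= 897
with fullscan;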

Stats Skew Analysis Script


Skewed Data: Finding The Columns

While in the process of looking for a job that led to the position I currently hold, I interviewed at a recognizably-named company that was struggling to keep up with their “statistics.” I have a confession: I had no idea what “statistics” were in the SQL Server world. I wasn’t offered that job, but in the interim I did a lot of reading and research on the SQL Server notion of “statistics.” Luckily, I was offered a position at another company a few months later. That company sent me to my very first PASS Summit (2013) in Charlotte, NC, where I sat in on Kimberly Tripp’s “Skewed Data, Poor Cardinality Estimates, and Plans Gone Bad” session.
Continue reading


Finding Value in Incremental Statistics, Pt. 2

Previously, I did a little investigation into incremental statistics and what benefit they can provide. The goal was to find the intersection of update timings for incremental and non-incremental statistics, with the hope of being able to demonstrate a reclamation of statistics maintenance time – and perhaps higher sample rates. Summary: we can, and auto_stats is a beneficiary. However, is this a solution in search of a problem? If you’ve been bumping into statistics maintenance issues, you’re probably already familiar with the more hands-on interventions available (my personal favorite is Ola Hallengren’s solution).
Continue reading

Finding Value in Incremental Statistics Pt. 1

I originally created this post and observed some rather embarrassing sloppiness and miscalculations. I’ve simply decided to pull the post and rework it. This is the result of those labors.

Recently, I answered a question on dba.stackexchange related to an interesting occurrence with incremental statistics. This led to a good conversation about the efficacy (and utility) of incremental statistics. The primary focus was on Erin Stellato’s confirmation that partition-level stats are not used by the Query Optimizer, thereby identifying a rather sizable gap in the utility of incremental statistics. The only benefit I could champion for incremental statistics was that they might allow us to sample at consistently higher rates, since the entire table would not need to be sampled – just the partitions that presented a compelling enough reason for an update (the modification counter being a primary consideration). Naturally, we’d incur the penalty of the higher sample rate at the initial sampling, but ongoing maintenance would be able to support the higher sample rates because we would only have to sample at the partition level. In this scenario, there are two measures to consider:
Continue reading

Duplicate Statistics Finder

I’m often asked about whether or not we have “duplicate” statistics in our environment. The thought is that, perhaps, duplicate statistics would cause a delay in execution plan compilation or even a different plan. I’m not sure that it’d be a big deal to have them (please correct me if I’m wrong), but I still decided to work up a little script that will help identify them, at least.
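The general idea is simple enough; here’s a minimal sketch (not the script from the post) that flags auto-created column statistics whose leading column matches the leading column of another statistic on the same table:

-- Minimal sketch only; "duplicate" here means an auto-created (_WA_Sys_%) statistic whose
-- leading column is also the leading column of another statistic on the same table.
select
    [schema_name]  = sh.name,
    table_name     = t.name,
    auto_stat      = s_auto.name,
    other_stat     = s_other.name,
    leading_column = c.name
from sys.stats s_auto
join sys.stats_columns sc_auto
    on  sc_auto.object_id = s_auto.object_id
    and sc_auto.stats_id  = s_auto.stats_id
    and sc_auto.stats_column_id = 1            -- leading column of the auto-created stat
join sys.stats_columns sc_other
    on  sc_other.object_id = sc_auto.object_id
    and sc_other.column_id = sc_auto.column_id
    and sc_other.stats_column_id = 1           -- same column leads another statistic
join sys.stats s_other
    on  s_other.object_id = sc_other.object_id
    and s_other.stats_id  = sc_other.stats_id
    and s_other.stats_id <> s_auto.stats_id
join sys.columns c
    on  c.object_id = sc_auto.object_id
    and c.column_id = sc_auto.column_id
join sys.tables t
    on t.object_id = s_auto.object_id
join sys.schemas sh
    on sh.schema_id = t.schema_id
where s_auto.auto_created = 1;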
Continue reading

The SQL Server 2014 DDL Bulldozer Part 1.1: Implicit Insert

Previously I posted my experimentation with wait_at_low_priority. Specifically, I investigated what would happen when index rebuild was executed while another session was inserting data.

Next, we want to see what happens when the index rebuild (with wait_at_low_priority and abort_after_wait = blockers specified) is already running and the insert is executed afterward.
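The scripts below assume the ddl_bulldozer extended event session from the previous post already exists. A rough sketch of what such a session might look like (the exact event list is an assumption, inferred from the fields shredded out of the .xel files later):

-- Rough sketch only; the real session definition is in the previous post.
create event session ddl_bulldozer on server
add event sqlserver.ddl_with_wait_at_low_priority (
    action (sqlserver.session_id)),
add event sqlserver.process_killed (
    action (sqlserver.session_id)),
add event sqlserver.progress_report_online_index_operation (
    action (sqlserver.session_id)),
add event sqlserver.sql_statement_starting (
    action (sqlserver.session_id)),
add event sqlserver.sql_statement_completed (
    action (sqlserver.session_id)),
add target package0.event_file (set filename = N'ddl_bulldozer')
with (track_causality = on);  -- track_causality supplies the attach_activity_id values read below
go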

Here’s the setup (note that I’ve pre-populated tempdb.dbo.numbers with 15 million rows):

use master;
go

alter event session ddl_bulldozer on server
state = start;

go

use tempdb;
go

alter index ix_tmp_numbers_num on dbo.numbers
rebuild with (
data_compression = page,
online = on ( wait_at_low_priority ( max_duration = 0 minutes,
abort_after_wait = blockers ))
);
go

use master;
go

alter event session ddl_bulldozer on server
state = stop;

go

if object_id('tempdb..#ddl_bulldozer') is not null drop table #ddl_bulldozer;
go
select object_name, cast(event_data as xml) event_data
into #ddl_bulldozer
from sys.fn_xe_file_target_read_file('ddl_bulldozer*',null,null,null) xf

select
x.d.value('@timestamp','datetime2(3)') event_timestamp,
db.object_name,
x.d.value('(action[@name="session_id"]/value)[1]','int') session_id,
x.d.value('(data[@name="statement"]/value)[1]','nvarchar(max)') sql_statement,
x.d.value('(data[@name="state"]/text)[1]','nvarchar(max)') sql_state,
x.d.value('(data[@name="duration"]/value)[1]','bigint') duration,
x.d.value('(data[@name="type"]/text)[1]','sysname') ddl_type,
x.d.value('(data[@name="abort_after_wait"]/text)[1]','sysname') abort_after_wait,
x.d.value('(data[@name="transaction_state"]/text)[1]','sysname') transaction_state,
x.d.value('(data[@name="killed_process_id"]/value)[1]','int') killed_process_id,
cast(substring(
x.d.value('(action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'),
1, 36) as uniqueidentifier) as activity_id,
cast(substring(
x.d.value('(action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'),
38, 10) as int) as event_sequence,
cast(substring(
x.d.value('(action[@name="attach_activity_id_xfer"]/value)[1]', 'varchar(50)'),
1, 36) as uniqueidentifier) as activity_id_xfer
--,db.event_data
from #ddl_bulldozer db
cross apply event_data.nodes('//event') x(d)
where db.object_name not like 'lock%'
order by event_timestamp,event_sequence

On my test instance, the rebuild takes about 28 seconds to complete, so during that time, I switch to another window and run the insert session:

/*
kick off an implicit transaction
(with code shamelessly lifted from Kendra Little at:
SQL Server Table Partitioning Tutorial: Videos and Scripts)
*/
use tempdb;
go

;WITH Pass0 AS ( SELECT 1 AS C UNION ALL SELECT 1),
Pass1 AS ( SELECT 1 AS C FROM Pass0 AS A, Pass0 AS B),
Pass2 AS ( SELECT 1 AS C FROM Pass1 AS A, Pass1 AS B),
Pass3 AS ( SELECT 1 AS C FROM Pass2 AS A, Pass2 AS B),
Pass4 AS ( SELECT 1 AS C FROM Pass3 AS A, Pass3 AS B),
Pass5 AS ( SELECT 1 AS C FROM Pass4 AS A, Pass4 AS B),
tally AS ( SELECT row_number() OVER ( ORDER BY C ) AS N FROM Pass5 )
insert numbers
select * FROM tally WHERE N <= 1000000;
GO

Once again, the insert fails (after about 12 seconds) and sends the error message to the client:

[screenshot: error returned to the insert session]

But what happened along the way?

[screenshot: extended events output, first half]
I’ve added a new event (progress_report_online_index_operation) for a bit more visibility into what’s occurring. What we see is the index rebuild starting (line 2, in green) and the ddl_with_wait_at_low_priority event firing because we told the DDL operation to wait at low priority. About four seconds later, I started the insert (in red). Along the way, we see the index rebuild making progress under different activity_ids. Because this is an online index rebuild, the page reorganization is happening in tempdb and the second (insert) session is allowed to continue its activity, for now. Also note the quick termination of the insert’s statement for the sake of a recompile (recompile cause: “schema changed”).
[screenshot: extended events output, second half]
At the end of it all, the insert transaction is killed and rolled back, just as in the last test. However, because the index rebuild was online, the metadata operation to switch in the rebuilt index structure is the first point at which the rebuild is blocked. At that point, the insert still holds its X lock (as before) on the object; that session is killed and its transaction rolled back, paving the way for the metadata switch to the new index structure.


The SQL Server 2014 DDL Bulldozer Part 1: Implicit Insert

Of the fascinating new features in SQL Server 2014, the one that provides me the most catharsis has got to be wait_at_low_priority. If your maintenance window SLA is threatened by “accidental” report cycles spilling into the window, perhaps you’ll want to explore this extension. While there are many blog posts out there to explain how to use it, I wanted to explore it and how it could apply to a variety of situations.
Continue reading


Can you repro this bug? Trace Flag 2363 Causes Fatal Exceptions on Windowed Aggregates

While testing SQL Server 2014 Enterprise edition, I wanted to see if I could use trace flag 2363 to gather meaningful information for statistics analysis. Part of this analysis included getting some information on the statistics in question.

With Trace Flag 2363 enabled:


DBCC TRACEON(2363,-1);
GO
select
schema_name = sh.name,
table_name = t.name,
stats_name = s.name,
leading_column = index_col(quotename(sh.name)+'.'+quotename(t.name),s.stats_id,1),
modification_percent = (100.*sp.modification_counter)/sp.unfiltered_rows
,sp.modification_counter
,sp.unfiltered_rows
,sp.rows_sampled
,avg_rows_in_db = avg(1.*sp.unfiltered_rows) over ()
from sys.stats s
join sys.tables t
on s.object_id = t.object_id
join sys.schemas sh
on t.schema_id = sh.schema_id
cross apply sys.dm_db_stats_properties(s.object_id,s.stats_id) sp
where sh.name != 'sys';

This results in a severe error being returned to the client:

Msg 0, Level 11, State 0, Line 77
A severe error occurred on the current command. The results, if any, should be discarded.
Msg 0, Level 20, State 0, Line 77
A severe error occurred on the current command. The results, if any, should be discarded.

Parsing the minidump with WinDbg, I arrived at this as the faulting module:

sqllang!DRgCId::Print+0x68

I’ve attempted the repro with user tables and different aggregates and still get the same error. If you can reproduce it, feel free to vote on the Connect item below.

https://connect.microsoft.com/SQLServer/feedback/details/1327021/trace-flag-2363-causes-fatal-exceptions-on-windowed-aggregates

SQL Saturday 358 Recap (#sqlsat358)

Big thanks to the Albuquerque SQL Saturday team – you always treat the speakers, sponsors, and guests so well. Huge thanks to Meredith Ryan who is an ever-present force of organization and helpfulness.

Continue reading


SQL Saturday 358: ABQ – The Deuce

This Saturday I’ll be in Albuquerque, NM at SQL Saturday 358. Last year was my first time attending and speaking at a SQL Saturday event, so I’m excited to return to the place that gave me that first chance.

I’ll be talking about managing statistics (again), with an eye toward proactive management. This is just like the presentation I gave at SQL Saturday 331 (Denver) and to the Performance Virtual Chapter, except it’s been completely re-written to enjoy some of the undocumented features in SQL Server 2014 and to focus more on proactive management. I hope to see you there!
