Who is afraid of the Data Collector? (Part 2)

By Andy McDermid | In Blog, SQL Server | June 25th, 2014

In a previous blog post we talked about what scares me most: sharks and avalanches. While they may terrify me, they are still pretty cool… when you know enough about them. The same is true of the dangers of the Data Collector and Management Data Warehouse (MDW). So I set out to learn more about them, and I share my insights with you here.

We left off on the “sharks,” or in other words, what happens after you enable and configure the Data Collector.

The Sharks:

Once you enable and configure the Data Collector, it seems there is no going back. If you change your mind and want to clean things up, there is no easy way to remove the associated SQL Agent jobs and MDW database objects. You’ll find that attempts to drop the SQL Agent jobs associated with the Data Collection Sets result in a “DELETE statement conflicted with the REFERENCE constraint” error.

The best you can do is disable the jobs and/or stop the Data Collection Sets. For more details see this Connect post. (Of interest: check out the comments section and note how many well-known members of the SQL community, bloggers and authors, have sounded off on this one.)
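If you go that route, here is a minimal cleanup sketch. It assumes the default system collection set names and the usual collection_set_* job naming, so verify both on your instance before running anything:

USE msdb;
GO
-- Stop a system collection set by name ('Disk Usage', 'Server Activity', 'Query Statistics'),
-- or disable data collection entirely with sp_syscollector_disable_collector
EXEC dbo.sp_syscollector_stop_collection_set @name = N'Server Activity';

-- Disable (not delete) the associated collection and upload jobs
DECLARE @job sysname;
DECLARE job_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM dbo.sysjobs
    WHERE name LIKE N'collection_set%';   -- assumption: default Data Collector job naming
OPEN job_cur;
FETCH NEXT FROM job_cur INTO @job;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC dbo.sp_update_job @job_name = @job, @enabled = 0;
    FETCH NEXT FROM job_cur INTO @job;
END
CLOSE job_cur;
DEALLOCATE job_cur;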

If you have multiple instances sharing a server, beware the schedules of the data collection upload jobs. By default they run every 15 minutes beginning at 12 a.m. A single upload is not much concern, but if two or more instances are uploading concurrently there can be some conflict, resulting in CPU and/or memory use spikes on the source server, as well as conflicts in the MDW database.

Back to the MDW and reporting side of things, although in general I like the layout and information available via the Management Data Warehouse Overview, I find the delivery a bit clunky. It’s slow to refresh when zooming in and out of time ranges and windows, and it’s slow going back to previous views. Also, the time range is limited to 24 hours in the Server Reports.

Digging deeper into the pre-packaged reports, under the Disk Usage Collection Set report you’ll notice that the Disk Usage History (via the DB name links) gives an error: “MDW Disk Usage for Database Report Error – A data source has not been supplied for the data source DS_TraceEvents”. Per this Connect post, this is not yet fixed. The good news is that this is only an issue if the MDW database is centralized.

Also under the Disk Usage Collection Set report you’ll find the reports by Database and Log (linked via the blue trend lines) are inaccurate. You can read more about it here where I take a closer look at what is going wrong.

Dangerous, but cool:

Ok, so we have established there are a number of things that can go awry when configuring and using Data Collection and the MDW – it’s certainly not perfect. However, we also know the underlying idea is a good one – a quick and easy baseline and performance monitoring solution. Let’s review the pros and cons…

Pro:

  • Free
  • No third party tools
  • Out-of-the-box solution to capture and report on baseline and performance metrics
  • No need to make and maintain roll-your-own perfmon collections and aggregation scripts

Con:

  • Avalanches
  • Sharks

So, what to do about those avalanches and sharks?

First, consider ignoring the option to centralize the MDW. Instead – similar to a local perfmon baseline – set each Data Collector to upload to a local MDW (preferably a dedicated utility/admin database, which each instance should have IMHO).

It’s a shame to miss out on the convenience of the centralization option but keeping the MDW local mitigates both the disk and network over-use issues as well as some of the reporting troubles. A big shop might have the infrastructure to support a huge central MDW.  For small and mid-size shops, you’re not losing key functionality by not centralizing the MDW, but you may be improving the current local baseline solution (if there is one).
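For what it’s worth, here is a minimal sketch of re-pointing the collector at a local database. It assumes a local MDW database named LocalMDW already exists (the Configure Management Data Warehouse wizard is still the easiest way to create the schema) and that data collection is disabled while the settings change:

USE msdb;
GO
DECLARE @local_instance sysname = @@SERVERNAME;

-- The collector configuration generally can't be changed while collection is enabled
EXEC dbo.sp_syscollector_disable_collector;

EXEC dbo.sp_syscollector_set_warehouse_instance      @instance_name = @local_instance;
EXEC dbo.sp_syscollector_set_warehouse_database_name @database_name = N'LocalMDW';  -- assumption: your local MDW name

EXEC dbo.sp_syscollector_enable_collector;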

Second, compress the tables in the MDW.
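Keep in mind data compression is an Enterprise Edition feature in the versions current as of this writing, so check your edition first. A simple sketch that generates page-compression rebuild statements for the MDW tables (review the output, then run it):

USE MDW;   -- assumption: your MDW database name
GO
SELECT 'ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
     + ' REBUILD WITH (DATA_COMPRESSION = PAGE);'
FROM sys.tables  AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE s.name IN (N'snapshots', N'core');
-- Nonclustered indexes can be handled the same way with ALTER INDEX ... REBUILD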

Then, explore the core and snapshots schema objects within the MDW and break free of the inflexibilities of the SSMS MDW reports. There are a number of views and procedures that can be executed outside of the SSMS MDW reports and leveraged into a custom reporting solution.

In fact, it’s feasible to reconsider centralized reporting as a custom solution if you set up a process to collect the local MDW data from each instance using the available procedures. Here is one example of a stored procedure that underlies many of the SSMS MDW reports (you can discover more using SQL Profiler):

snapshots.rpt_generic_perfmon [ @ServerName, @EndTime, @WindowSize, @DataGroupID, @CollectionSetUid ]

The @EndTime and @WindowSize parameters allow you to pull as much data as you like from the MDW database. The @DataGroupID parameter identifies the type of report: ServerActivity, SqlActivity, SqlMemoryUsage, SystemCpuUsage, SystemDiskUsage, SystemMemoryUsage. (Note: for any of these use N'49268954-4FD4-4EB6-AA04-CD59D9BB5714' for the @CollectionSetUid.)
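To make that concrete, here is a hedged example call. Apart from the procedure name, parameter list, and the CollectionSetUid above, everything in it is an assumption (the server name, the dates, and especially the units of @WindowSize), so trace an SSMS MDW report run with Profiler to confirm the values your build expects:

USE MDW;   -- assumption: your MDW database name
GO
EXEC snapshots.rpt_generic_perfmon
    @ServerName       = N'MYSERVER\INST1',                  -- source instance as stored in the MDW
    @EndTime          = '2014-06-30 23:59',
    @WindowSize       = 744,                                 -- window ending at @EndTime (744 assumes hours, roughly one month)
    @DataGroupID      = N'SystemCpuUsage',                   -- one of the report types listed above
    @CollectionSetUid = N'49268954-4FD4-4EB6-AA04-CD59D9BB5714';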

The data collected by the system collection sets can be quite valuable once you get to know it. As part of a monthly review for each instance I like to use these stored procedures to pull a month of data from the MDW database into a CSV file. Then, back on my PC, I can use that file to build Excel charts to add to a monthly SQL instance review.

Usage

Take control of the size and growth of the MDW database. Don’t blindly accept the default retention days on the System Data Collection Sets (14 days for Server and Query, 730 days for Disk). Instead, configure each Data Collection Set to keep only as much as you require, or use the @retention_days parameter of the core.sp_purge_data proc to keep the data purged down to a reasonable range. Per this KB article, the purge process has been improved; however, you might still review this MSDN question and the links within – especially regarding indexing – if you run into problems with the purge process.
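A sketch of both approaches, with names and day counts as examples to adjust (you may need to stop a collection set before updating it, then start it again):

USE msdb;
GO
-- Option 1: trim the retention on a system collection set (default is 730 days for Disk Usage)
EXEC dbo.sp_syscollector_update_collection_set
     @name = N'Disk Usage',
     @days_until_expiration = 90;
GO
-- Option 2: purge directly on the MDW side with the retention override
USE MDW;   -- assumption: your MDW database name
GO
EXEC core.sp_purge_data @retention_days = 90;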

Likewise, review the default schedules of the data collection upload jobs, especially for stacked instances. Don’t allow two upload jobs to run on the same 15-minute interval.
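A quick way to see where those schedules currently stand (the collection_set% job-name pattern is the default naming and is an assumption here):

USE msdb;
GO
SELECT  j.name AS job_name,
        s.name AS schedule_name,
        s.freq_subday_interval,    -- e.g. 15 (minutes) by default
        STUFF(STUFF(RIGHT('000000' + CAST(s.active_start_time AS varchar(6)), 6), 5, 0, ':'), 3, 0, ':') AS start_time
FROM dbo.sysjobs         AS j
JOIN dbo.sysjobschedules AS js ON js.job_id = j.job_id
JOIN dbo.sysschedules    AS s  ON s.schedule_id = js.schedule_id
WHERE j.name LIKE N'collection_set%upload%';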

Turn off the Query Statistics Data Collection Set. Cool as it is, it takes a lot of disk space to archive this data. Also, some of the reports based on this collection reference the plan cache; if the query you are examining no longer exists in the cache, the report may be incomplete. Approach this data collector and its reports more like a profiler trace: start the Query Statistics Data Collection Set for a period of interest, then stop it and analyze the results.
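A sketch of that start/stop approach, using the system collection set’s default name:

USE msdb;
GO
EXEC dbo.sp_syscollector_start_collection_set @name = N'Query Statistics';
-- ...let it run over the period you want to analyze, then...
EXEC dbo.sp_syscollector_stop_collection_set  @name = N'Query Statistics';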

Stagger the start times for upload jobs for instances that share a server, but also – if the MDW is centralized – across the environment to minimize insert conflicts within the MDW database.
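One hedged way to do that staggering in T-SQL; the schedule name below is the shared schedule I see the Data Collector create by default, so confirm it in msdb.dbo.sysschedules before running this:

USE msdb;
GO
-- Offset this instance's 15-minute upload schedule to start at 00:05;
-- use a different offset on each instance sharing the server (or the central MDW).
EXEC dbo.sp_update_schedule
     @name              = N'CollectorSchedule_Every_15min',   -- assumption: default collector schedule name
     @active_start_time = 500;                                -- HHMMSS as an integer: 00:05:00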

And finally, if you need to clean up a Data Collection and MDW setup, refer to this very comprehensive post from Aaron Bertrand.

What have your experiences been? Let me know, I’d love to hear from you.

Andy McDermid
Principal SQL Server DBA
Andy is an MCITP-certified MS SQL DBA who delivers and manages the delivery of DBA services to many diverse clients. He enjoys helping his clients by finding and deploying pragmatic, practical solutions for their database issues. Andy is always working to improve and expand his DBA skills, and he likes to share the experience through writing.


2 thoughts on “Who is afraid of the Data Collector? (Part 2)”
  1. Hi Andy. I saw this great article on the web and liked it. I have the same issue you mentioned with the MDW Disk Usage for Database Report Error – “A data source has not been supplied for the data source DS_TraceEvents” – when clicking through from the central MDW server to a remote server’s disk usage report, but I noticed one thing:

    This affects only user databases and not system databases. On any of my 25 servers where I use data collection and send it to a central MDW server, I can, from that central server, click and read the Disk Usage report if the database I click the hyperlink for is a system database (master, tempdb, msdb or model). But, as you say, if I click the link for a user database I get that error. It seems like Microsoft won’t solve this issue? Have you managed to get any further with this? Many thanks and best regards

    Micael Carlsson
    DBA at Eskilstuna Kommun
    Sweden

  2. Hi Micael, I believe your reports for system DBs return data because – no matter what instance you are on – those DBs are always there. So we see the populated report, but it is for the local system DBs, not the instance where the data collectors are polling metrics. To look at it another way, if you added an empty ‘dummy’ DB named DS_TraceEvents to your central server, the report for that DB may work (haven’t tried this yet, just a thought) – i.e. the error basically states that DB is not there, so if we add the DB it should work.
    Anyway, my idea in the post here is to host an MDW DB locally on each instance rather than a central MDW, then aggregate the metric data as a more manual process using the stored proc examples – this is one way to avoid the SSMS report issue. I haven’t tried the report in SQL 2014 yet, but the bug was still there in 2012. Thanks for your comment!