Press "Enter" to skip to content

Month: July 2018

Get-DbaDbCompression DBATools Cmdlet

Jess Pomfret walks us through her recent contribution to dbatools:

I’ve been using this at work recently, and it also relates to the presentation I gave at the ONSSUG June meeting around data compression. The beginnings of this script originated online as I dug into learning about the DMVs that relate to objects and compression, and then I customized it for what I needed.

If you run the below as-is, it will provide basic information about all objects in your database, except those in the ‘sys’ schema, along with their current size and compression level.

Click through for the script or, if your version of dbatools is up to date, call Get-DbaDbCompression.
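To give a sense of the DMVs underneath this kind of check, here is a rough sketch of the idea (my own illustration, not the dbatools implementation): sys.partitions exposes the compression setting per partition, and sys.dm_db_partition_stats supplies the size.

-- Current compression level and size per object, excluding the sys schema.
SELECT s.name AS schema_name,
       o.name AS object_name,
       i.name AS index_name,
       p.partition_number,
       p.data_compression_desc,
       ps.row_count,
       ps.reserved_page_count * 8 / 1024 AS reserved_mb
FROM sys.partitions AS p
    JOIN sys.objects AS o ON o.object_id = p.object_id
    JOIN sys.schemas AS s ON s.schema_id = o.schema_id
    JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id
    JOIN sys.dm_db_partition_stats AS ps ON ps.partition_id = p.partition_id
WHERE s.name <> N'sys';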


Building A Calendar Table

Louis Davidson has an example of a calendar table in SQL Server:

The solution is part of my calendar/date dimension code, and it is used to do relative positioning over date periods. For example, say you need to get data from the last 10 days. You can definitely use a simple BETWEEN to filter the rows, and a bunch of date functions to group by year, month, etc., generally all of the “normal” groupings. But using a calendar table allows you to prebuild a set of date calculations that make the standard values easier to get, and non-standard groupings possible. The technique I will cover makes moving around in the groupings easier, such as when you want data from the last 3 complete months. The query to do this isn’t rocket science, but it isn’t exactly straightforward either.

For the example, I will use the calendar table that I have on my website here: http://drsql.org/code, in the download SimpleDateDimensionCreateAndLoad, and I will load it with data up until 2020. Here is that structure:

Read on for examples of usage.  This is an example where thinking relationally differs from thinking procedurally—imagining date ranges as pre-calculated sets isn’t intuitive to procedural developers, but it can give a big performance boost.
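To give a flavor of the technique, here is a simplified sketch (not Louis’s actual structure; the table, the MonthsAgo offset column, and dbo.Sales are all illustrative) showing how a precalculated relative-offset column turns “last 3 complete months” into a plain range predicate:

-- Simplified calendar sketch; a real date dimension carries many more attributes.
CREATE TABLE dbo.Calendar
(
    DateValue date NOT NULL PRIMARY KEY,
    CalendarYear int NOT NULL,
    CalendarMonth int NOT NULL,
    MonthsAgo int NOT NULL  -- 0 = current month, 1 = last month, and so on
);

-- "Last 3 complete months" becomes a simple filter on the offset column.
SELECT s.SaleDate, s.Amount
FROM dbo.Sales AS s
    JOIN dbo.Calendar AS c ON c.DateValue = s.SaleDate
WHERE c.MonthsAgo BETWEEN 1 AND 3;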


Parallelism Strategies For Grouping Operations

Itzik Ben-Gan continues his series on grouping data in SQL Server by looking at how these operations can go parallel:

Besides needing to choose between various grouping and aggregation strategies (preordered Stream Aggregate, Sort + Stream Aggregate, Hash Aggregate), SQL Server also needs to choose whether to go with a serial or a parallel plan. In fact, it can choose between multiple different parallelism strategies. SQL Server uses costing logic that results in optimization thresholds that under different conditions make one strategy preferred to the others. We’ve already discussed in depth the costing logic that SQL Server uses in serial plans in the previous parts of the series. In this section I’ll introduce a number of parallelism strategies that SQL Server can use for handling grouping and aggregation. Initially, I won’t get into the details of the costing logic, rather just describe the available options. Later in the article I’ll explain how the costing formulas work, and an important factor in those formulas called DOP for costing.

As you will later learn, SQL Server takes into account the number of logical CPUs in the machine in its costing formulas for parallel plans. In my examples, unless I say otherwise, I assume the target system has 8 logical CPUs. If you want to try out the examples that I’ll provide, in order to get the same plans and costing values as I do, you need to run the code on a machine with 8 logical CPUs as well. If your machine happens to have a different number of CPUs, you can emulate a machine with 8 CPUs—for costing purposes—like so:

DBCC OPTIMIZER_WHATIF(CPUs, 8);

Even though this tool is not officially documented and supported, it’s quite convenient for research and learning purposes.

This is a fairly long article, but a great one.
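If you want a quick query to experiment with while reading, a simple grouped aggregate is enough to compare serial and parallel costing (my own illustration against an assumed dbo.Orders table with custid and qty columns, not one of Itzik’s examples):

-- Force a serial plan and note the estimated cost.
SELECT custid, SUM(qty) AS totalqty
FROM dbo.Orders
GROUP BY custid
OPTION (MAXDOP 1);

-- Let the optimizer consider a parallel plan and compare the costing.
SELECT custid, SUM(qty) AS totalqty
FROM dbo.Orders
GROUP BY custid;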


Installing Confluent Platform On Windows

Niels Berglund shows how to install Confluent Platform (the Confluent-branded version of Apache Kafka) on a Windows machine using the Windows Subsystem for Linux:

WSL is primarily aimed at developers, and it allows you to run Linux environments directly on Windows in a native format and without the overhead of a virtual machine. Let us take another look at that statement: run Linux environments directly on Windows in a native format. Yes, native format: WSL is not a UNIX-like environment like Cygwin, which wraps non-Windows functionality in Win32 system calls; instead, it serves Linux programs as special, isolated minimal processes (pico-processes) attached to kernel-mode pico-providers. If you want to read all the “gory” details about WSL, the Windows Subsystem for Linux Overview gives you an excellent introduction.

Surprisingly, it’s pretty easy—I would have expected some strange compatibility issues.


Building Cone Plots In Plotly

The Plotly blog shows how to use Python to build 3D cone plots using Plotly:

This plot uses an explicitly defined vector field. A vector field refers to an assignment of a vector to each point in a subset of space.

In this plot, we visualize a collection of arrows that simply model the wind speed and direction at various levels of the atmosphere.

3-D weather plots can be useful to research scientists to gain a better understanding of the atmospheric profile, such as during the prediction of severe weather events like tornadoes and hurricanes.

Sometimes a 3D plot is the best answer.  When it is, this looks like a good solution.  H/T R-bloggers


Using APPLY To Aggregate Unpivoted Data

Dan Clemens gives us yet another practical use of the APPLY operator:

I had a situation last week where I needed to find the MIN() and MAX() values of some data. Normally this would be pretty straightforward, but the problem was I needed to identify the MIN() and MAX() values amongst multiple columns. So today I bring you a SQL tip using APPLY.

Looking at sys.procedures we can see I have a create_date and a modify_date column for each record. What if we needed to return the MIN() or MAX() value from those 2 (or more) columns?

SELECT [Name]
,create_date
,modify_date
FROM sys.procedures AS p
WHERE p.[name] = 'ChangePassword';

In this two-column example, it’s not too difficult.  As you add more and more columns, the solution remains the same, though the urge to ask why all of these dates are unpivoted might increase…
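Dan’s post has the full solution; the general shape of the pattern, sketched here from his description (the aliases are mine), is to unpivot the columns with CROSS APPLY and a VALUES constructor, then aggregate:

-- Turn the two date columns into rows, then take MIN/MAX per procedure.
SELECT p.[name],
       MIN(d.dt) AS earliest_date,
       MAX(d.dt) AS latest_date
FROM sys.procedures AS p
    CROSS APPLY (VALUES (p.create_date), (p.modify_date)) AS d(dt)
WHERE p.[name] = 'ChangePassword'
GROUP BY p.[name];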


Sorting When Your Measure Is Not In The Visual

Kasper de Jonge shows us different ways of sorting a visual by some unrelated measure:

So let’s start with the simple one: I want to sort a chart on a measure that is not part of the visual. Let’s take this visual:

Now instead of sorting by OrderQuantity, I want to sort by the ListPrice. The trick here is to make the measure part of the query, and one way you can do that is by adding it to the tooltip.

Read on for examples for charts as well as matrices.


Managing Central Management Server

Chrissy LeMaire shows how you can use dbatools to manage Central Management Server and registered servers:

It’s a super useful feature that not all DBAs know about. Since CMS data is stored in msdb and accessible via SMO, you can access it from SQL Server Management Studio or PowerShell modules like dbatools.

Central Management Server’s essential functionality includes:

Actions that are taken by using a central management server group act on all servers in the server group. This includes connecting to servers by using Object Explorer and executing Transact-SQL statements and Policy-Based Management policies on multiple servers at the same time.

I mostly use it as a visual repository of my SQL Servers. Prior to using dbatools and Invoke-DbaSqlQuery, however, I did use CMS to easily execute code against a number of different servers.

CMS is a great feature, and is a critical tool for scaling out a SQL Server infrastructure.
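Because the registered server data lives in msdb, you can also peek at it directly with T-SQL. Here is a quick sketch (the msdb view names are as I recall them; dbatools and SSMS remain the friendlier interfaces):

-- Registered servers and their groups, straight from msdb.
SELECT g.name AS group_name,
       s.name AS registered_name,
       s.server_name,
       s.description
FROM msdb.dbo.sysmanagement_shared_registered_servers AS s
    JOIN msdb.dbo.sysmanagement_shared_server_groups AS g ON g.server_group_id = s.server_group_id
ORDER BY g.name, s.name;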


Checking File Sizes In SQL Server

Andy Mallon looks back at a contribution by Junior DBA Andy, this one on checking file sizes:

This is every DBA’s favorite game. Figuring out what DMV contains the data you want. It turns out there are two places that database file size info is maintained. Each database has sys.database_files which has information for that database. The master database also has sys.master_files, which contains information for every database.

Using sys.master_files seems like it would be the obvious choice: everything in one view in master is going to be easier to query than hitting a view in a bunch of different databases. Alas, there’s a minor snag. For tempdb, sys.master_files has the initial file size, but not the current file size. This is valuable information, but doesn’t answer the use cases we set out above. I want to know the current file sizes. Thankfully, sys.database_files has the correct current file sizes for tempdb, so we can use that.

Using sys.database_files seems like it’s going to be the right move for us then. Alas, this isn’t quite perfect either. With Log Shipping, and Availability Group secondaries, if you’ve moved data files to a different location, sys.database_files will contain the location of the files on the primary database. Thankfully, sys.master_files has the correct local file locations for user databases, so we can use that.

Ugh, so it looks like the answer is “both”… we’ll need to use sys.database_files for tempdb, and sys.master_files for everything else.

Click through for the script, including Andy’s critical reflection on how Past Andy has failed him.
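The logic Andy describes boils down to something like the following sketch (a simplified illustration of the approach, not Andy’s full script):

-- tempdb: only its own sys.database_files has the current sizes.
SELECT N'tempdb' AS database_name,
       name AS logical_name,
       physical_name,
       size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files
UNION ALL
-- Everything else: sys.master_files has the correct local paths and sizes.
SELECT DB_NAME(database_id),
       name,
       physical_name,
       size * 8 / 1024
FROM sys.master_files
WHERE database_id <> DB_ID(N'tempdb');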


Adding IN Search Functionality To .NET

Jay Robinson shows off a few extension methods he creates to make dealing with C# easier:

Then I could use the extension like this:

if (mySeries.In(Enum.Series.ProMazda, Enum.Series.Usf2000)) myChassis = "Tatuus";

As for the other two methods, well… When is a null not a null? When it’s a System.DBNull.Value, of course! SQL Server pros who have spent any time in the .NET Framework will recognize this awkwardness:

var p = new System.Data.SqlClient.SqlParameter("@myParam", System.Data.SqlDbType.Int);
p.Value = (object)myVar ?? System.DBNull.Value;

With the extension, the second line becomes:

p.Value = myVar.ToDbNull();

I like it that Jay ended up going with a different language than T-SQL.  It’s no F#, but it’ll do.
