Press "Enter" to skip to content

Curated SQL Posts

SQL Operations Studio March Release

Alan Yu announces a new version of SQL Operations Studio:

Take a moment to breathe, this is a loaded update.

The March Public Preview release is focused on improving our Extensibility story and continuing to address top GitHub issues. This includes enabling Extension Manager, improving the Manage Dashboard experience and providing a couple Insights extensions. Please see the following details.

  • Enhance the Manage Dashboard extensibility model to support tabbed Insights and Configuration panes

  • Dashboard Insights extensions for sp_whoisactive from whoisactive.com and a Server Reports example

  • Extension Manager enables simple acquisition of 1st-party and 3rd-party extensions

  • Add additional Extensibility APIs for connection and objectexplorer management

  • Community Localization open for 10 languages

  • Continue to fix important customer impacting GitHub issues

There’s some nice stuff in this release, but the big story is around extensibility.

Comments closed

Using DATETIMEOFFSET

Randolph West continues his date and time data type series:

DATETIMEOFFSET works the same way as the DATETIME2 data type, except that it is also time zone aware. It is formatted as 'YYYY-MM-DD HH:mm:ss[.nnnnnnn][{+|-}hh:mm]'.

Got all that? YYYY represents a four-digit year, MM is a two-digit month between 1 and 12, DD is a two-digit day between 1 and 31 depending on the month, HH represents a two-digit hour between 0 and 23, mm is the minutes between 0 and 59, while ss is the number of seconds between 0 and 59. Once again, n represents between zero and seven decimal places in a fraction of a second.

The main difference from DATETIME2 is the time zone offset at the end, which is the number of hours and minutes as an offset from UTC time.

Read on for more.  I generally don’t use this date type much, preferring to stick with DATETIME2 and saving data as UTC.

Comments closed

Why Does Empirical Variance Use n-1 Instead Of n?

Sebastian Sauer gives us a simulation showing why we use n-1 instead of n as the denominator when calculating the variance of a sample:

Our results show that the variance of the sample is smaller than the empirical variance; however even the empirical variance too is a little too small compared with the population variance (which is 1). Note that sample size was n=10 in each draw of the simulation. With sample size increasing, both should get closer to the “real” (population) sample size (although the bias is negligible for the empirical variance). Let’s check that.

This is an R-heavy post and does a great job of showing that it’s necessary, and ends with  recommended reading if you want to understand the why.

Comments closed

Testing Disk Speed With diskspd

Marek Masko shows how to test I/O performance using the diskspd utility:

What is Diskspd?

Diskspd is a storage testing tool created by Microsoft Windows, Windows Server and Cloud Server Infrastructure Engineering teams. It combines robust and granular IO workload definition with flexible runtime and output options. That makes it a perfect tool for storage performance testing, validation and benchmarking.

Where to find Diskspd?

Diskspd is a free and open source utility. Its source code can be found on GitHub. The repository also hosts other frameworks which use Diskspd. You can find them under ‘Frameworks’ directory. A binary release is hosted by Microsoft at the following location: http://aka.ms/diskspd.

Click through for more details, including an example of a poorly-performing I/O solution.

Comments closed

Finding SQL Server Instances With dbatools

Chrissy LeMaire shows off a very helpful command in dbatools:

Nearly every time I inherit a SQL Server environment, I’m only given a partial list of SQL Servers that exist on the network. It’s my usual routine to get permission to sniff the network then run about five different programs including Idera’s SQL Discovery and Microsoft’s SQL Server Assessment and Planning Toolkit.

I always thought it’d be cool to have one comprehensive PowerShell command that could do the work of all the above and was ecstatic to see NetSPI’s Scott Sutherland had written a few commands to do just that in his awesome PowerShell module PowerUpSQL.

When I saw Scott’s multi-pronged approach (including some UDP magic 🎩), I asked if he’d be interested in contributing to dbatools and he said yes! He submitted a gorgeous mock-up and I was so excited. Then came the PR, complete with great documentation and multithreading.

Click through for a lot more information on the command.

Comments closed

Resuming Azure SQL Data Warehouse With Powershell

Arun Sirpal shows how to unpause an Azure SQL Data Warehouse instance using Powershell:

I totally forgot that with Azure SQL DWH you can pause and resume compute, to save money because it is expensive. Question is how do you go about resuming compute? TSQL is not possible and sure you can do the change via Azure portal but what about PowerShell?

This makes it easy to script out an overnight data load and then pausing the Azure Data Warehouse until the morning when those analysts come in, so that you can save a bit of cash (or a lot, depending upon your DWU utilization).

Comments closed

Comparing Distinctness

Michael J. Swart shows several options for comparing whether an attribute’s value is distinct from a parameter:

Check it:

DECLARE @TeamId bigint = NULL,
    @SubTeamId bigint = NULL;
 
SELECT TOP 1 TaskId
FROM tasks
WHERE assignedTeamId IS NOT DISTINCT FROM @TeamId
  AND assignedSubTeamId IS NOT DISTINCT FROM @SubTeamId

Talk about elegant! That’s what we wanted from the beginning. It’s part of ANSI’s SQL 1999 standard. Paul White tells us it’s implemented internally as part of the query processor, but it’s not part of T-SQL! There’s a connect item for it… err. Or whatever they’re calling it these days. Go read all the comments and then give it a vote. There are lots of examples of problems that this feature would solve.

PROS: Super-elegant!
CONS: Invalid syntax (vote to have it included).

This would be nice to have.  In the meantime, Michael shows several options which are currently valid syntax.

Comments closed

Rotating Out Partitions

Kendra Little explains that there are a couple of models available for partitioned table management:

I recently received a terrific question about table partitioning:

I’m writing up a proposal for my company to start partitioning a 2.5TB table. The idea/need is to control the size of the table while saving the old data. The old data will be moved to an archive database on a different server where the BI guys work with it.

In none of the videos articles I’ve seen is the explanation of how the rolling partition works on a long term daily basis.

  1. Are the partitions reused, like in a ROUND ROBIN fashion?
  2. Or, do you add new partitions each day with new filegroups, drop the oldest partition off – this would be FIFO?

Lots of folks assume the answer here is always #2, simply because there’s a bunch of sample code out there for it.

But option #1 can be simpler to manage when it fits your data retention technique!

Click through to learn more about reusable partitioning.

Comments closed

Second-Order SQL Injection Attacks

Bert Wagner explains what he calls second-order SQL injection attacks:

SQL injection attacks that delay execution until a secondary query are known as “second order”.

This means a malicious user can inject a query fragment into a query (that’s not necessarily vulnerable to injection), and then have that injected SQL execute in a second query that is vulnerable to SQL injection.

Let’s look at an example.

Another way of thinking about this is a persisted SQL injection attack, akin to reflected versus persisted cross-site scripting attacks.  The fix is, don’t trust unsanitized user input.  Just because you put a user’s data into your database doesn’t mean that someone sanitized it, so treat that stuff as unsafe unless you know otherwise.

Comments closed

Avro Schema Compatibility In Kafka

Neha Bhardwaj walks us through an error in Kafka:

You might have come across a similar exception while working with AVRO schemas.

Kafka throws this exception due to a compatibility issue since the current schema is not compatible with the earlier schema registered on this topic.

You can check the current schema(s) on the topic using:
curl -X GET <a href=”http://localhost:8081/subjects//versions/”&gt;http://localhost:8081/subjects//versions/

Read on to understand what this error means and how you can fix it if you see it.

Comments closed