Press "Enter" to skip to content

Curated SQL Posts

What Is Power BI?

Angela Henry gives a high-level overview of Power BI:

There are lots of reasons to use Power BI, other than, it’s so cool.  For instance, Power BI makes it easy to see, in one glance, all the information needed to make decisions.  It also allows you to monitor the most important information about your business.  Power BI makes collaboration easy and when I say easy I mean EZ!  You can also create customized Dashboards tailored to those C-Suite folks or make a completely different dashboard based on the same data for those that actually do the work.

I’m personally astounded at how far visualization tools have come in half a decade.

Comments closed

Spatial Data

Dave Mattingly has a multi-part series on spatial data.  This is part 5 (with links to the previous).

 

If you’re interested in spatial data, this looks like a fantastic set of blog posts which mesh well with Dave’s presentation on spatial data.

Comments closed

Migrating To Azure SQL Database

James Serra has a good post on moving your on-premise SQL Server instance up to Azure SQL Database:

In this migration process you migrate both your schema and your data from the SQL Server database in your current environment into SQL Database, provided the existing database passes compatibility tests.  Fortunately with SQL Database Version 12 (V12), there are very few remaining compatibility issues other than server-level and cross-database operations (see Azure SQL Database Transact-SQL differences).  Databases and applications that rely on partially or unsupported functions will need some re-engineering to fix these incompatibilities before the SQL Server database can be migrated.

Two years ago, I would have laughed at the idea.  Right now, I’m skeptical.  My expectation is that, two years from now, this will be my default answer for non-sensitive data.

Comments closed

Defaults Not Guaranteed Equal

Michael J. Swart shows that two DATETIME2 columns with default constraints will not necessarily show the same value upon insertion:

If I want to depend on these values being exactly the same, I can’t count on the default values.

Default constraints will fill in the correct value, but as Michael notes, “the correct value” is calculated each time.  Also, note that his results are about a millisecond off, so if you’re just using DATETIME, the frequency of observation of this occurrence will be lower, as DATETIME is only good to 3 milliseconds.  That’s not a good reason to use DATETIME, though.

Comments closed

Data Manipulation In R

Casimir Saternos has an article on matrix operations and other data transformations in R:

Operations that are conceptually simple can be difficult to perform using SQL.  Consider the common requirements to pivot or transpose a dataset.   Each of these actions are conceptually straightforward but are complex to implement using SQL.  The examples that follow are somewhat verbose, but the details are not significant. The main point is to illustrate is that, by using specialized functions outside of SQL,  R makes trivial some of those operations that would otherwise require complex SQL statements.  The contrast in the amount of code required is striking.  The simpler approach allows you to focus attention on the scientific or business problem at hand, rather than expending energy reading documentation or laboriously testing complex statements.

I consider this where the second-order value of R comes in.  The initial “wow” factor is in how easy you can plot things, and this ease of data cleansing is the next big time-saver.

Comments closed

Monitoring MDX Query Result Serialization

Chris Webb digs into MDX query serialization.

Part 1:

The Serialize Results Begin event marks the point where SSAS starts to construct the cellset returned. The Serialize Results Current events that immediately follow it, with EventSubclass “1 – Serialize Axes” show SSAS serialising the tuples that are present on the Columns axis (listed as Axis 0 in the TextData column), the Rows axis (Axis 1) and the Where clause (Slicer Axis). The numeric values in the ProgressTotal column for the Serialize Results Current events shows the number of tuples on each axis: the two tuples on columns are the two measures, the six tuples on rows are the six years, and there’s one tuple on the slicer. After that SSAS gets the data for each of the cell values (as shown by the Query Subcube Verbose event – note that this query is running on a warm cache) and there is then a Serialize Results Current event with EventSubclass “2 – Serialize Cells”; the ProgressTotal column shows that twelve cells (2 columns * 6 rows) in total were returned.  The Serialize Results End event shows that SSAS has finished constructing the cellset and not surprisingly it’s followed immediately by the Query Cube End and Query End events.

Part 2:

There are a couple of interesting things to note about this query. First, SQL Server Management Studio on my laptop says that it takes nine seconds to run, even on a warm cache; the Duration column for the Query End event in Profiler, however, shows a value of around six seconds. The three second difference must be the time it takes for SSAS to return the cellset to SQL Server Management Studio, and for SQL Server Management Studio to render the results (my guess is that it’s the second operation that takes the majority of this time – other client tools may be more efficient at rendering large resultsets).

I have so little experience with MDX that this is a wide world of amazement to me.  So it’s very nice that Chris is digging into this and peeling away some of the mystery.

Comments closed

Finding A Query In Query Store

Grant Fritchey shows us how to find a query in the Query Store:

The primary views you’ll want are sys.query_store_query and sys.query_store_query_text. They join together based on the query_text_id. Let’s take four scenarios and see if we can retrieve the correct query.

This is a little more complicated than I would have hoped, but as Grant notes, this is the CTP, so maybe there will be some tooling added to make life a little easier.

Comments closed

Index Scans

Gail Shaw proves that sometimes, an index scan isn’t a full index scan:

A scan of the entire index is over 1600 pages. This query read three. It’s a scan, but it’s a scan which stopped after reading one page of the leaf (the other two are likely the root and intermediate pages, used to locate the first page in the leaf).

The scan read one row and then stopped, because that’s all that was needed. It did that, because there was a 1 row row-goal added to the query. For more details on row goals, see Paul White’s article on the subject.

Read the whole thing.

Corollary:  sometimes a seek isn’t really a seek; sometimes it’s a scan even when the icon says “seek” because that’s more efficient.

1 Comment