Creating Database Snapshots

David Fowler has a script which lets you create database snapshots easily:

The procedure takes two parameters,

@DatabaseList – a comma delimited string of database names, allows wildcards
@ListOnly – 1- a list of affected databases will be displayed but snapshots aren’t created.
0- Snapshots are created automatically DEFAULT

I’m a big fan of database snapshots in development and QA environments—take a snapshot, run a workload, revert the snapshot.

Azure Data Lake Analytics Pipelines

Yan Li notes that Azure Data Lake Analytics now offers the ability to manage pipelines:

To make it easier to manage and understand jobs, ADLA now captures the pipeline and recurrence information for each job. This information can be used to connect and organize jobs belonging to the same pipeline or recurring instances. As shown in Fig 2, now jobs are organized by pipeline and recurring instances which enable you to:

  • Quickly identify jobs in pipelines which may have failed or taken longer than expected.

  • Get the aggregated statistics (e.g. job counts, successful and failed AU hours etc.) for a pipeline or a recurring instance

This is an interesting improvement.

Integrating Azure Data Catalog With Power BI

Gaston Cruz shows how to tie view Azure Data Catalog data in Power BI:

A Self Service culture will allow to address analysts to generate their own reports, lists, and dashboards without dependence on the schedule and availability of IT staff. In these cases reports combine different sources of information are generated, many of which may not have been used historically in the company, and this in turn implies that a large number of cases which source you do not know used to implement certain reports.

Azure Data Catalog comes as an option to break that cycle of discovery that is usually done manually. This means that after the first cycle where the business analyst discovers the sources of optimal data to generate certain reports the can register, and add information (metadata) to make this source easier to discover future analysts requiring such data for the implementation of similar reports. The discovery of these sources, and capability to add metadata are procedures do not have to give at the same time but Data Catalog allows work annotations by analysts as a continuous work in time where more information is added to the repository every time.

Click through for a demo.

Automatically Fix Those VLFs

Tracy Boggiano has a script which will fix log files with high virtual log file counts:

First part of the process if to capture the info from DBCC LOGINFO or if you are ready for 2017 the new dmv sys.dm_db_log_stats into a table you can read later to know how many VLFs exist in your database currently. So we going to create table called VLFInfo and used the procedure VLF_UpdateInfo to populate that data.  The procedure would be called in step one of a SQL Agent Job to automate the fixing of VLF files during appropriate maintenance windows on your server.  But as you will see in the Step 2 may solutions tries to account for not doing to close to when the file just grew an acquired those new lovely extra VLFs.

Read on for the code.

Explaining Confidence Intervals

Mala Mahadevan explains what confidence intervals are:

Suppose I look at a sampling of 100 americans who are asked if they approve of the job the supreme court is doing. Let us say for simplicity’s sake that the only two answers possible are yes or no. Out of 100, say 40% say yes. As an ordinary person, you would think 40% of people just approve. But a deeper answer would be – the true proportion of americans who approve of the job the supreme court is doing is between x% and y%.

How confident I am that it is?  About z%. (the common math used is 95%).  That is an answer that is more reflective of the uncertainty related to questioning people and taking the answers to be what is truly reflective of an opinion. The x and y values make up what is called a ‘confidence interval’.

Read the whole thing.

An Analysis Of The Utility Of Power BI Report Server

Meagan Longoria has an after-action report of a proof of concept using Power BI Report Server:

In addition to managing to versions of Power BI Desktop, I also found myself mentally managing two sets of features. I was constantly asking myself “Can I do that in Power BI Report Server?”. Some of that is because PBI Desktop for Report Server is on a quarterly release cycle rather than monthly, so I had to remember if a feature I wanted to use was new (or in preview) and therefore not available in this version. The other part is trying to remember what you can and cannot do with a Live Connection. For example, you can make report measures, but you can’t use ad hoc grouping and binning.

We had several scenarios where users wanted to be able to group fields in multiple ways that changed somewhat frequently. Since we couldn’t use grouping and binning in Power BI Desktop to accomplish this, we set up an Excel data source in the SSAS Tabular model, and allowed users to change the groups there and refresh the Tabular model when finished. This could get rather unwieldy if you had lots of users who needed this kind of flexibility.

Ultimately, the customer considered it a success.  Read on for more details.

Active Directory Integration On Ubuntu

Drew Furgiuele shows how to configure SQL Server on Linux to use Windows authentication:

In the following post, we’ll walk through joining a Linux SQL Server on Unbuntu to an Active Directory domain, and here’s the steps we’re going to take:

  1. Installing the required software and services to enable a Linux host to talk to and join an Active Directory Domain,
  2. Configuring the Linux host’s network configuration to talk to the Domain Controller(s),
  3. Setting up Samba, Kerberos, Winbind, and the System Security Services Daemon (SSSD) to properly talk to and digest authentication tokens from Active Directory, and
  4. Creating a Kerberos Keytab file for the SQL Server service to run as a domain service account.

Seems like a lot, doesn’t it? If you’re new to Linux, a lot of this configuration can seem a little daunting and a lot tedious, but as we walk through it, I’ll stop and talk a little bit about each step and what it does.

Active Directory integration was a critical piece of functionality for SQL Server on Linux.  There are still some odd edge cases (like weirdness when going cross-domain) but for the normal scenario, it works fine once you’ve configured Linux correctly.

Temporal Table Time Zones

Louis Davidson talks about how to translate temporal table start and end dates to your local time zone:

In my first 5 blogs on Temporal, I failed to note something pretty important about their usage. The start and end times for the row (and it follows, the historical rows) are likely not in your local time zone. They are stored in UTC time zone. This is obviously good because of that pesky daylight saving time issue where 1:30 AM occurs twice each year (here in the US, the time occurs on the first Sunday of November).

Unless you live in London, England or someplace where the offset from UTC is 0, it can make working with these tables confusing, because most people rarely work in UTC time, and even rarer is to think in UTC time when most of your data is likely in your local time zone. So you write your query and use local time…and then, most likely, data is returned…but not necessarily the data you actually desired.

Click through to see ways of translating those values.

Reporting Services Versus Power BI Report Server

John White compares and contrasts SQL Server Reporting Services versus Power BI Report Server:

Power BI Report Server (PBIRS) was first introduced in May 2017. Based on SQL Server Reporting Services (SSRS), it brings the ability to work with Power BI reports completely on premises in addition to all the other capabilities of SSRS. Given this, it would be reasonable to conclude that PBIRS was the next version of, or a replacement for SSRS, but that is not the case. I have heard people state that SSRS is “going away”, but this is simply not the case. SSRS is still a core part of the Microsoft BI stack. So, what are the differences between the two platforms? The differences boil down to features, licensing, and update cadence.

If you’re in the BI/report writing space, you will want to read the whole thing.

Automatic Retry With Optimistic Concurrency

Vladimir Khorikov explains an anti-pattern when dealing with a model using optimistic concurrency (for example, memory-optimized tables):

Alright, back to the original question. So, how to combine optimistic locking and automatic retry? In other words, when the application gets an error from the database saying that the versions of a Product don’t match, how to retry the same operation again?

The short answer is: nohow. You don’t want to do that because it defeats the very purpose of having an optimistic lock in the first place.

Remember that the locking mechanism is a way to ensure that all changes are taken into consideration when changing a record in the database. In other words, someone should review the new version of the record and make an informed decision as to whether they still want to submit the update. And that should be the same client who originated the initial request, you can’t make that decision for them.

Plenty of systems do this sort of data merging automatically, but I get Vladimir’s point:  if someone else pulled the rug out from under you, it might change your decision on what that data should look like.


September 2017
« Aug