Press "Enter" to skip to content

Author: Kevin Feasel

Replication Publisher To Azure SQL DB

Jes Borland continues her series on transactional replication from on-prem SQL Server + Availability Groups into Azure SQL Database:

After initializing, check the Snapshot Agent and Log Reader Agent for success. (To do so, go to Replication, right-click the publication name, and select Snapshot Agent Status and Log Reader Agent Status.) I ran into problems with the Snapshot account not having high enough permissions in the databases (it needs db_owner), and then not having enough permissions on the snapshot folder (it needs Full). (This forum post, answered by Hilary Cotter, helped: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/899857db-e38e-4026-a34c-2a8c2628c6fc/access-denied-to-sql-replication-snapshot-folder?forum=sqlreplication.)

Except for the final section, it’s pretty much the same as dealing with on-prem SQL Server sans Availability Groups.
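
If you hit the same permission errors, the fix looks something like the following (the account, database, and share names here are hypothetical):

    -- Grant the Snapshot Agent account db_owner in the published database.
    USE MyPublishedDB;
    GO
    CREATE USER [DOMAIN\SnapshotAgentSvc] FOR LOGIN [DOMAIN\SnapshotAgentSvc];
    ALTER ROLE db_owner ADD MEMBER [DOMAIN\SnapshotAgentSvc];
    GO
    -- The same account also needs Full Control on the snapshot share, e.g.:
    -- icacls "\\FileServer\ReplData" /grant "DOMAIN\SnapshotAgentSvc:(OI)(CI)F"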

STRING_AGG() Performance

Aaron Bertrand wants to know how the STRING_AGG() function performs:

We can see that our FORCESCAN hint really did make things worse – while we shifted the cost away from the clustered index seek, the sort was actually much worse, even though the estimated costs deemed them relatively equivalent. More importantly, we can see that STRING_AGG() does offer a performance benefit, whether or not the concatenated strings need to be ordered in a specific way. As with STRING_SPLIT(), which I looked at back in March, I am quite impressed that this function scales well prior to “v1.”

Given that the early releases tend to be “get the thing working” and later CTPs are about “make the thing faster,” it’s nice to see that STRING_AGG() is already ready for prime time, and it makes me wonder if they’ll make it even faster by RTM.
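
For reference, here is the function in action (table and column names are invented for the example):

    -- Build one comma-separated list of order IDs per customer,
    -- ordered by date within each group.
    SELECT s.CustomerID,
           STRING_AGG(CONVERT(VARCHAR(12), s.OrderID), ',')
               WITHIN GROUP (ORDER BY s.OrderDate) AS OrdersInDateOrder
    FROM dbo.Sales AS s
    GROUP BY s.CustomerID;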

Unencrypted Backups With TDE

Steve Jones shows what you need to do to take an unencrypted backup on a database with TDE configured:

When SQL Server goes to restore the file, it reads part of the header. In here, the process must detect the DEK and try to decrypt that key. However, since this new instance does not have the certificate, this doesn’t work and an error is thrown, despite not needing the key since the data isn’t encrypted.

The issue here is the DEK still exists in the source database.

Read the whole thing for the solution.
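
Without giving too much away, the general shape of the fix is to remove encryption entirely, including the DEK, before taking the backup.  A sketch, with a placeholder database name:

    -- Turn off TDE, then remove the data encryption key itself.
    ALTER DATABASE MyTdeDB SET ENCRYPTION OFF;
    GO
    -- Wait until sys.dm_database_encryption_keys reports
    -- encryption_state = 1 (unencrypted) for this database.
    USE MyTdeDB;
    GO
    DROP DATABASE ENCRYPTION KEY;
    GO
    -- This backup will now restore without the source certificate.
    BACKUP DATABASE MyTdeDB TO DISK = N'C:\Backups\MyTdeDB.bak';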

Windows Virtual Accounts

Wayne Sheffield describes virtual accounts and how SQL Server can make use of them:

SQL Server will use these groups in many places so that permissions are granted to the group, instead of the actual service account. This simplifies things greatly if you change the service account – SQL Server Configuration Manager will just change the membership of this group instead of having to hunt down and change everywhere that it knows permissions are needed for the service account. Using these groups instead of the service account will also simplify your life if you ever change the service account – otherwise, all of those specific permissions that you granted on local resources (paths, registry, etc.) would have to be changed. With the group in place, those permissions stay exactly as they are.

I consider virtual accounts, particularly when you stick to using the virtual account itself rather than a domain account, to be a really good security feature, as they prevent system administrators from getting lazy and using the same service account everywhere.  This in turn blocks an attacker from using a pass-the-hash strategy to pivot from one SQL Server instance to another.
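
As a quick illustration, granting a local resource to the service SID rather than to the underlying account looks like this from an elevated command prompt (the path is made up):

    :: Grant Modify on a backup folder to the SQL Server service SID
    :: (default instance); the grant survives any service account change.
    icacls "D:\SQLBackups" /grant "NT SERVICE\MSSQLSERVER:(OI)(CI)M"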

Mastering Tools

The folks at Sharp Sight Labs explain that future obsolescence of a tool does not mean you should not master it:

The heart of his critique is this: data science is changing very fast, and any tool that you learn will eventually become obsolete.

This is absolutely true.

Every tool has a shelf life.

Every. single. one.

Moreover, it’s possible that tools are going to become obsolete more rapidly than in the past, because the world has just entered a period of rapid technological change. We can’t be certain, but if we’re in a period of rapid technological change, it seems plausible that toolset-changes will become more frequent.

The thing I would tie this to is George Stigler’s classic paper on the economics of information.  There’s a cost of knowing, which the commenter notes, but there’s also a cost to search, even when you know where to look.  Being effective in any role, be it data scientist or anything else, involves understanding the marginal benefit of pieces of information.  This blog post gives you a concrete example of that in the realm of data science.

Collecting Deadlock Information

Kendra Little has scripts to collect deadlock graph information using extended events or a server-side trace:

Choose the script that works for you. You can:

  1. Use a simple Extended Events trace to get deadlock graphs via the sqlserver.xml_deadlock_report event

  2. Use a Server Side SQL Trace to get deadlock graphs (for older versions of SQL Server, or people who like SQL Trace)

  3. Use a (much more verbose) Extended Events trace to get errors, completed statements, and deadlock graphs. You only need something like this if the input buffer showing in the deadlock graph isn’t enough, and you need to collect the other statements involved in the transactions. You do this by matching the transaction id for statements to the xactid for each item in the Blocked Process Report. Warning, this can generate a lot of events and slow performance.

I’d default to script #1 and look at #3 in extreme scenarios.
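
For a sense of what option #1 involves, a minimal session looks something like this (a sketch rather than Kendra’s exact script; the session name is made up):

    -- Capture deadlock graphs to an event file target.
    CREATE EVENT SESSION [collect_deadlocks] ON SERVER
    ADD EVENT sqlserver.xml_deadlock_report
    ADD TARGET package0.event_file (SET filename = N'collect_deadlocks');
    GO
    ALTER EVENT SESSION [collect_deadlocks] ON SERVER STATE = START;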

Polybase With Azure Blob Storage

I look at using Polybase to read data from Azure Blob Storage:

To this point, I have focused my Polybase series on interactions with on-premises Hadoop, as it’s the use case most apropos to me.  I want to start expanding that out to include other interaction mechanisms, and I’m going to start with one of the easiest:  Azure Blob Storage.

Ayman El-Ghazali has a great blog post on the topic, which he turned into a full-length talk.  As such, this post will fill in the gaps rather than start from scratch.  In today’s post, my intention is to retrieve data from Azure Blob Storage and get an idea of what’s happening.  From there, we’ll spend a couple more posts on Azure Blob Storage, looking a bit deeper into the process.  That said, my expectation going into this series is that much of what we do with Azure Blob Storage will mimic what we did with Hadoop, as there are no Polybase core concepts unique to Azure Blob Storage, at least none that I am aware of.

Spoilers:  I’m still not aware of any core concepts unique to Azure Blob Storage.
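
For orientation, pointing Polybase at a blob container looks roughly like this (the storage account, container, and key are placeholders):

    -- Polybase treats Blob Storage as a HADOOP-type external data source.
    -- A database master key must exist before creating the credential:
    -- CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
    CREATE DATABASE SCOPED CREDENTIAL AzureBlobCredential
    WITH IDENTITY = 'user', SECRET = '<storage account access key>';
    GO
    CREATE EXTERNAL DATA SOURCE AzureBlob
    WITH (
        TYPE = HADOOP,
        LOCATION = 'wasbs://mycontainer@mystorageacct.blob.core.windows.net',
        CREDENTIAL = AzureBlobCredential
    );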

In-Memory OLTP For Reporting

Daniel Janek shows that memory-optimized tables aren’t just for OLTP scenarios:

I wouldn’t have thought that Hekaton could take my report query down from 30+ min to 3 seconds, but in the end it did. *Note that the source data is static and repopulated just twice a week. With that said, I didn’t bother looking into any limitations that “report style” queries may impose on OLTP operations. I’ll leave that to you.

With SQL Server 2016 (an important caveat), memory-optimized tables can work great for reporting scenarios.  The important factor is having enough RAM to store the data.
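
For the curious, a durable memory-optimized table for this kind of static reporting data might look like the following (the schema is invented for illustration):

    -- Requires a MEMORY_OPTIMIZED_DATA filegroup in the database.
    CREATE TABLE dbo.ReportFacts
    (
        FactID   INT IDENTITY NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        SaleDate DATE NOT NULL,
        Amount   DECIMAL(18, 2) NOT NULL,
        INDEX ix_ReportFacts_SaleDate NONCLUSTERED (SaleDate)
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);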

Transactional Replication To Azure SQL Database

Jes Borland has a five-part series on replicating databases in an Availability Group to Azure SQL Database.  Part 1 involves planning:

There are tasks you’ll need to take care of in SQL Server, the AG, and the SQL DB before you can begin.

This blog series assumes you already have an AG set up – it won’t go through the setup of that. It also assumes you have an Azure SQL server and a SQL Database created – it won’t go through that setup either.

Ideally, the publishers, distributor, and subscribers will all be the same version and edition of SQL Server. If not, you have to configure from the highest-version server, or you will get errors.

Part 2 prepares the replication distributor:

The first step in this process is to set up the remote distributor. As I mentioned in the first blog, you do not want your distribution database on one of the AG replicas. You need to set this up on a server that is not part of the AG.

Start by logging on to the distributor server – in this demo, SQL2014demo.
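
For a sense of what that looks like, configuring the remote distributor boils down to a couple of calls run on the distributor itself (the server name comes from Jes’s demo; the password is a placeholder):

    -- Make this instance a distributor and create the distribution database.
    EXEC sp_adddistributor @distributor = N'SQL2014demo',
                           @password = N'<strong distributor password>';
    EXEC sp_adddistributiondb @database = N'distribution';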

Stay tuned for the remainder of the series.
