Press "Enter" to skip to content

Curated SQL Posts

How To Create Difficult Measures In Power BI

Matt Allington walks through his process for creating measures in Power BI:

Killer Tip 1: Create Good Test Data

The first thing I did was to replicate the test data shown above.  As I have mentioned many times, good quality test data is essential to getting a quick correct answer to your problem.  The data from the OP looked pretty good as it had covered the relevant scenarios (1 ID had just blue, 1 ID had blue and red, several IDs had no blue – this is good test data).

Starting out with good test data is vital: it helps you clarify exactly what you want, and if you come up with edge cases, you have the makings of a good test workbench to ensure that your code actually works, not just in the simplest scenario.

Comments closed

Don’t Unit Test Private Methods

Vladimir Khorikov argues that you should not unit test private methods:

When your tests start to know too much about the internals of the system under test (SUT), that leads to false positives during refactoring. Which means they won’t act as a safety net anymore and instead will impede your refactoring efforts because of the necessity to refactor them along with the implementation details they are bound to. Basically, they will stop fulfilling their main objective: providing you with the confidence in code correctness.

When it comes to unit testing, you need to follow this one rule: test only the public API of the SUT, don’t expose its implementation details in order to enable unit testing. Your tests should use the SUT the same way its regular clients do, don’t give them any special privileges. Here you can read more about what an implementation detail is and how it is different from public API: link.

In the database world, this is one reason why I like using stored procedures:  they give the equivalent of a public API for database code, so you can write tests for them.
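
To make that concrete, here is a minimal sketch of what a test against a stored procedure's public contract might look like using the tSQLt framework. Every object name here (dbo.GetActiveCustomers, dbo.Customer, the test class) is invented for illustration:

-- Hypothetical procedure under test: dbo.GetActiveCustomers.
-- The test exercises only the procedure's public contract (inputs and result set),
-- not the tables or logic it uses internally.
EXEC tSQLt.NewTestClass 'CustomerTests';
GO
CREATE PROCEDURE CustomerTests.[test GetActiveCustomers returns only active rows]
AS
BEGIN
    -- Arrange: isolate the underlying table and seed known data
    EXEC tSQLt.FakeTable 'dbo.Customer';
    INSERT INTO dbo.Customer (CustomerID, IsActive) VALUES (1, 1), (2, 0);

    -- Act: capture the procedure's output
    CREATE TABLE #Actual (CustomerID INT);
    INSERT INTO #Actual EXEC dbo.GetActiveCustomers;

    -- Assert: only the active customer comes back
    CREATE TABLE #Expected (CustomerID INT);
    INSERT INTO #Expected VALUES (1);
    EXEC tSQLt.AssertEqualsTable '#Expected', '#Actual';
END;
GO
EXEC tSQLt.Run 'CustomerTests';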

Comments closed

Azure SQL Database FAQ

Dimitri Furman answers some common questions about Azure SQL Database:

Q7. Can I use Windows Authentication in Azure SQL Database?

The short answer is no. Therefore, if you are migrating an application dependent on Windows Authentication from SQL Server to Azure SQL Database, you may have to either switch to SQL Authentication (i.e. use a separate login and password for database access), or use Azure Active Directory Authentication (AAD Authentication).

The latter is conceptually similar to Windows Authentication in the sense that connections from directory principals are authenticated without the need to provide additional secrets, such as a password. Since Azure Active Directory can be federated with the on-premises Active Directory Domain Services, it can effectively authenticate the same Active Directory principals that could access the database prior to migration. However, the authentication flow for AAD Authentication is significantly different, so the analogy with Windows Authentication only goes so far.

There are some good questions in here, especially the one about retry logic; that’s good to have in any situation, but becomes vital when working with a cloud service.
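
For what it's worth, the switch mostly shows up in the connection string. Here's a rough sketch for a .NET-style client, with the server, database, and user names invented for illustration (the exact spelling of the Authentication keyword varies by driver):

SQL Authentication:
Server=tcp:yourserver.database.windows.net,1433;Database=yourdb;User ID=appuser;Password=<secret>;

Azure Active Directory (integrated) Authentication:
Server=tcp:yourserver.database.windows.net,1433;Database=yourdb;Authentication=Active Directory Integrated;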

Comments closed

How Statistics In SQL Server Have Changed Over The Years

Erin Stellato gives us a version-based timeline of how SQL Server has handled statistics over the years:

SQL Server 2008

This is a very interesting historical look.  Most interesting to me were the decreases in the number of steps available.

Comments closed

Inline U-SQL Functions

Damien Widera shows us how to write inline functions in U-SQL:

Now let’s go to the new thing – undocumented usage of inline functions. My function is pretty simple, and I can imagine that the function you write could be as simple as mine, but your functions will probably do something more useful. To simplify the coding process, you could use an inline function in your U-SQL script and not have to write any code in the C# file.

The code could look like this:

DECLARE @in string = "/Data/Aircraft/2006ByMonth/{*}.tsv";
DECLARE @out string = "/Data/Aircraft/2006ByMonth/out/CSharpFunction.tsv";
DECLARE @func Func<int,int> = (s)=>{return s+1;};

Things which make languages look more like functional languages generally get a thumbs up from me.
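
To round out the picture, here is a hedged sketch of how that declaration might be used in the rest of such a script. The schema (Month, Flights) is an assumption for illustration, not the actual columns from the post:

// Assumed two-column schema for the monthly TSV files; adjust to the real layout.
@flights =
    EXTRACT Month int,
            Flights int
    FROM @in
    USING Extractors.Tsv();

// The inline lambda declared above is invoked like any other C# function.
@result =
    SELECT Month,
           @func(Flights) AS FlightsPlusOne
    FROM @flights;

OUTPUT @result
TO @out
USING Outputters.Tsv();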

Comments closed

Copying Azure Data Lake Databases

Yanan Cai shows how to copy Azure Data Lake databases for local debugging and development:

The concept of a database is used to group related data structures and functions together. ADLA users have databases in their production environment that contain tables, assemblies, table valued functions and other objects. Previously, when developing and tuning U-SQL queries on a local machine, developers would have to manually recreate everything in their production database. After coding they would have to identify any changes to the database and then update the production account’s database. This process took extra time and introduced errors without adding any value.

Using the Export Wizard, developers can clone the existing database environment and sample data directly to the local account. Developers can also choose to export only parts of the database to the local database. Follow the steps below to export your U-SQL databases.

Click through for the step-by-step process.

Comments closed

More R Services Internals Spelunking

Niels Berglund continues his series on R Services internals:

What happens is that straight after the AuthenticateConnection you will hit the WriteAsync breakpoint twice, the same way as we see in Code Snippet 4. The first hit at WriteAsync sort of makes sense, as it ties up with the call-stack we see in Code Snippet 5. But what about the second WriteAsync (the one that causes the second package to be sent), where does that come from? To try to figure that out, we start with the call-stack for that particular WriteAsync.

Execute the code in Code Snippet 1 again, and continue to the second WriteAsync. When you hit the breakpoint do a kc again. The call-stack should now look somewhat like this (this time it is the full call-stack):

It’s interesting the kind of stuff you can find with Wireshark and a debugger.

Comments closed

External Memory Pressure With SQL On Linux

Anthony Nocentino explains how SQL Server on Linux reacts to memory pressure:

We can use tools like ps, top, and htop to look at our virtual and physical memory allocations. We can also look in the /proc virtual file system for our process and look at the status file. In here we’ll find the point-in-time status of a process and, most importantly, the types of memory allocations for a process. We’ll get granular data on the virtual memory allocations and also the resident set size of the process. Here are the interesting values in the status file we’re going to focus on today.

  • VmSize – total current virtual address space of the process

  • VmRSS – total amount of physical memory currently allocated to the process

  • VmSwap – total amount of virtual memory currently paged out to the swap file (disk)

The differences are going to be interesting for people to troubleshoot later, particularly if you look at SOS_SCHEDULER_YIELD and give a knee-jerk reaction that the problem is with CPU.
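
If you want to pull those three counters yourself, something along these lines should do it on most distributions (the pgrep loop is my own shorthand, not from Anthony's post; SQL Server on Linux typically runs more than one sqlservr process):

# Print the memory counters for each sqlservr process from /proc
for pid in $(pgrep -x sqlservr); do
    echo "== PID $pid =="
    grep -E '^Vm(Size|RSS|Swap)' /proc/$pid/status
done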

Comments closed

PowerShell’s Switch Statement

Shane O’Neill shows us PowerShell’s switch statement:

We send out notification emails if jobs fail as well but hey, we’re DBAs, we like a backup!

Since this used to be done manually, and since I like PowerShell, I created separate functions that can take individual log files and parse out the needed data in a nice, easy format.

My problem, and the reason for this post, was figuring out, based on the name of the file in a directory, which function to call…

Click through to see how to use the switch statement, as well as how to switch on an expression.
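
For a taste of what that looks like, here is a minimal sketch of switching on file names with the -Wildcard parameter. The file patterns and parser function names are invented for the example, not Shane's actual ones:

# Route each log file to a parser based on its name (illustrative names only)
Get-ChildItem -Path 'C:\Logs' -Filter '*.log' | ForEach-Object {
    switch -Wildcard ($_.Name) {
        'Backup*'  { Parse-BackupLog  -Path $_.FullName }
        'Restore*' { Parse-RestoreLog -Path $_.FullName }
        default    { Write-Warning "No parser defined for $($_.Name)" }
    }
}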

Comments closed