EF Core Merge Statements

Richie Rump looks at SQL that Entity Framework Core generates when inserting a batch of records:

If you’re an experienced SQL tuner, you’ll notice some issues with this statement. First off, the query has not one but two table variables. It’s generally better to use temp tables because table variables don’t have good statistics by default. Secondly, the statement uses a MERGE statement. The MERGE statement has had more than its fair share of issues. See Aaron Bertrand’s post “Use Caution with SQL Server’s MERGE Statement” for more details on those issues.

But that got me wondering: why would the EF team use SQL features that perform so poorly? So I decided to take a closer look at the SQL statement. Just so you know, the code that was used to generate the SQL saves three entities (Katana, Kama, and Tessen) to the database in a batch. (Julie used a Samurai theme so I just continued with it.)

Yeah…I’m not liking the MERGE statement very much here.

Genomic Analysis In Spark

Tom White and Jonathan Keebler show off Hail, a package for performing genomic analysis in Apache Spark:

One of the most important downstream analyses is finding genetic trait associations. Association studies look for statistical associations between genetic variation and phenotypic traits, that is, an observable characteristic of an individual, such as hair color or disease. With the increasing availability of whole-genome sequence data, it’s possible to look for variants from across the whole genome that may be associated with a disease, rather than heavily relying only on commonly known variants as in a traditional genome-wide association study (GWAS).

The challenge for downstream processing is scale. Tools that can cope with a few hundred or even a few thousand genomes, such as the well-known 1000 Genomes dataset, can’t handle datasets that are one or more orders of magnitude larger. These datasets are now becoming commonplace, thanks to the multiple sequencing efforts taking place around the world like the 100,000 Genomes Project in the UK and the Precision Medicine Initiative in the US.

Genomic analysis has been right in Hadoop’s wheelhouse for a while.

Grid Features In SQL Prompt

Derik Hammer shows off some of the grid functionality in Red Gate’s SQL Prompt:

Even more common than scripting out INSERT statements, I may need to copy a set of values and format them for an IN clause. Normally I would use a text editor such as Notepad++ to reformat the multiple lines of values. SSMS can also be used but I find Notepad++’s find/replace features better.

Now I do not have to worry about copying/pasting the values and making changes. SQL Prompt delivers a direct conversion from values to IN clause.

Click through for some animated GIFs showing how to use this functionality.
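
For readers without SQL Prompt, the same reformatting can be approximated by hand. The following is only a minimal sketch, not how SQL Prompt works internally: the function name `to_in_clause` is hypothetical, and it assumes the copied values are plain numbers or strings. It is meant for ad hoc query editing; queries built from untrusted input should use parameters instead.

```python
def to_in_clause(values):
    """Format a list of numbers and strings as a SQL IN (...) list."""
    def literal(value):
        # Numbers are emitted as-is.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return str(value)
        # Everything else is treated as text: wrap in single quotes
        # and double any embedded single quote.
        return "'" + str(value).replace("'", "''") + "'"

    return "IN (" + ", ".join(literal(v) for v in values) + ")"


# Example: values copied from a results grid, one per row.
print(to_in_clause([10, 20, 30]))
print(to_in_clause(["Katana", "Kama", "Tessen"]))
```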

Foreign Key Check Options

Louis Davidson shows how to create a foreign key constraint which is enabled or disabled, trusted or untrusted:

I am in the middle of building a utility (for work, and for my next SQLBLOG post) that will help when you need to drop the foreign key constraints on a table so that you can truncate it, while holding the scripts in a table so the constraints can be re-created afterward. The first thing, though, is to make sure I understand all of the scripting possibilities.

When I started hunting around to remember how to create a disabled constraint, I couldn’t easily find anything, so I figured I would make this a two-parter. (My blogging rule is if I look for something and find a good article about it, reference it, then tweet the article out. If it is too hard to find, blog about it!) So today I will review how to create a FOREIGN KEY constraint in three ways:

  • Enabled, and Trusted – Just as you would normally create one

  • Enabled, Not Trusted – The “quick” way, not checking data to see if any wrong data already exists, but not allowing new, bad data in

  • Disabled, Not Trusted – The constraint is basically documentation of the relationship, but you are on your own to make sure the data matches the constraint

In an ideal world, all of your constraints are enabled and trusted, but when you’re building a general-purpose script, you can’t always assume that will be the case.  Click through for examples on how to create foreign key constraints fitting each of these scenarios.
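
As a rough illustration of the three states, here is a small sketch that emits the T-SQL for each one. This is not Louis’s script: the helper `foreign_key_ddl` and the table, column, and constraint names are hypothetical. It relies on standard `ALTER TABLE` options: `WITH CHECK` validates existing rows, `WITH NOCHECK` skips that validation, and `NOCHECK CONSTRAINT` disables the constraint.

```python
def foreign_key_ddl(child, column, parent, parent_column, name,
                    enabled=True, trusted=True):
    """Return the T-SQL statements that create a foreign key in the given state."""
    if trusted and not enabled:
        # SQL Server marks a disabled constraint as not trusted.
        raise ValueError("a disabled constraint cannot be trusted")

    # WITH CHECK validates existing rows; WITH NOCHECK skips validation.
    check = "WITH CHECK" if trusted else "WITH NOCHECK"
    statements = [
        f"ALTER TABLE {child} {check} ADD CONSTRAINT {name} "
        f"FOREIGN KEY ({column}) REFERENCES {parent} ({parent_column});"
    ]
    if not enabled:
        # Disabling is a separate statement issued after creation.
        statements.append(f"ALTER TABLE {child} NOCHECK CONSTRAINT {name};")
    return statements


# Hypothetical tables: dbo.Child references dbo.Parent.
for enabled, trusted in [(True, True), (True, False), (False, False)]:
    print(f"-- enabled={enabled}, trusted={trusted}")
    for sql in foreign_key_ddl("dbo.Child", "ParentId", "dbo.Parent",
                               "ParentId", "FK_Child_Parent",
                               enabled=enabled, trusted=trusted):
        print(sql)
```

You can confirm the resulting state by checking the `is_disabled` and `is_not_trusted` columns in `sys.foreign_keys`.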

Dynamic Markdown YAML

Steph Locke shows how to use the params section of a YAML header to enable parameter reuse:

You may already know the trick about making the date dynamic to whatever date the report gets rendered on by using the inline R execution mode of rmarkdown to insert a value.

title: "My report"
date: "`r Sys.Date()`"
output: pdf_document

What you may not already know is that YAML fields get evaluated sequentially, so you can take a value created further up in the params section and use it later in the block.

Click through to see how it’s done.

Power BI On-Prem Details

Ginger Grant explains what’s going on with Power BI Premium and the on-prem offering:

It is not possible to run Power BI reports locally right now, but sometime before the 1st of July 2017, users who have SQL Server 2016 Enterprise Edition per-core and active Software Assurance [SA] can deploy Power BI Report Server. This means that no one is going to have to wait for SQL Server 2017 for Power BI on-premises, as that will be available sometime in June. The functionality in the SQL Server 2017 SQL Server Reporting Services [SSRS] Community Technology Preview [CTP] edition is going to be available in Power BI Report Server, with the addition of the ability to include custom visuals, which the CTP version did not do.

Power BI Report Server includes all of the functionality of SSRS. This means that users will not need an SSRS Server and a Power BI Server, as the Power BI Server will be able to do both. If you want to migrate all of the reports created in SSRS from 2008 R2, and SSRS Mobile Reports, you can migrate these reports to the new Power BI Report Server, provided of course you have SQL Server 2016 Enterprise per-core edition with SA.

The Power BI Report Server will be a separate install with separate release schedules. Microsoft has announced that they are planning on doing updates at a greater frequency than SQL Server. Power BI Report Server will also be able to publish reports to mobile devices. If a report uses data in the cloud, you can employ a Data Gateway, as the Power BI Report Server can use the gateway to access cloud data. Of course, if all of the data in the report is located on-premises, no gateway will be required.

I’m a bit disappointed that the on-prem installation will not allow you to create dashboards, but perhaps that will come in time.

Attaching A SQL Server Database To A Docker Container

Mat Hayward-Hill shows how to attach an existing MDF file to a SQL Server on Linux instance in Docker:

Now we are ready to attach the database using the TSQL below. For this demo, I used Management Studio from my laptop to connect to SQL Server.

In the TSQL we need to use the FOR ATTACH_REBUILD_LOG argument as we have no log file to attach. It will create a 1MB log file in the default log file directory.

It’s better to restore a full backup, but there’s more than one way to connect a database.
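
The attach statement itself is short. As a hedged sketch rather than Mat’s exact script, the helper below builds a `CREATE DATABASE ... FOR ATTACH_REBUILD_LOG` statement. The function name, the database name, and the file path are hypothetical; the path assumes the MDF file was copied into the container’s default data directory.

```python
def attach_rebuild_log_sql(database, mdf_path):
    """Build T-SQL that attaches an MDF file and rebuilds the missing log file."""
    # Bracket-quote the database name and escape the path as a string literal.
    name = "[" + database.replace("]", "]]") + "]"
    path = mdf_path.replace("'", "''")
    return (
        f"CREATE DATABASE {name} "
        f"ON (FILENAME = N'{path}') "
        "FOR ATTACH_REBUILD_LOG;"
    )


# Hypothetical database name and path inside the container.
print(attach_rebuild_log_sql("DemoDb", "/var/opt/mssql/data/DemoDb.mdf"))
```

Run the generated statement from any client connected to the container, such as Management Studio or sqlcmd.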

Azure SQL Data Warehouse Security

Grant Fritchey looks at what security measures are available within Azure SQL Data Warehouse:

Login Security

You have two core choices on logins. First, you have to create a SQL login at the server level for both Azure SQL Database and Azure SQL Data Warehouse. You can’t remove this or disable it (to my knowledge, and I’ve tried), so make the password a good one (and don’t lose it). You can then create other SQL logins, but this is not a recommended best practice. In fact, I wouldn’t do it at all unless I was forced because of some third party product (few of which currently support Azure anyway).

The next choice, the preferred choice, is to set up Azure Active Directory. With Azure AD you get all the functionality you’re used to with your local AD. Further, you can federate Azure AD with your local AD to control and manage the logins from within your network. You also get multi-factor authentication with Azure AD. We are talking real security here. Read through the documentation on setting up authentication to get it right. You can do the whole thing using PowerShell commands, so there’s no excuse for not automating it.

There aren’t as many security-related toggles as in an on-prem product, but Grant demonstrates what is available.

Installing Multiple SSRS Instances

Dave Mason explains how to set up multiple SQL Server Reporting Services installations to run against a single SQL Server instance:

Have you ever needed to install multiple instances of SSRS, with each instance “connected” to the same instance of the SQL Server database engine? (By “connected”, I mean that the pair of [ReportServer] databases for each SSRS instance would all reside on the same instance of SQL Server. And each SSRS instance would be reporting on data from one or more databases that also resided on the same instance of SQL Server.)

To my surprise, I don’t see much guidance for this scenario on the internet. TechNet has an article. It’s consistently one of the first search results I get back for variations of “Install multiple instances of SSRS”. That article (and a few others) omit a simple installation step/requirement that was a blind spot for me. (More on that towards the end.) I finally figured out what I was doing wrong and eventually succeeded with my task. Let’s walk through the steps.

I’m not quite positive what problem this best solves, but that could just be a lack of vision on my part.


May 2017