Understanding Estimation Variance

Aaron Bertrand discusses drift between estimates and actuals in execution plans:

Now, inaccurate estimates won’t always be a problem, but it can cause issues with inefficient plan choices at the two extremes. A single plan might not be optimal when the chosen range will yield a very small or very large percentage of the table or index, and this can get very hard for SQL Server to predict when the data distribution is uneven. Joseph Sack outlined the more typical things bad estimates can affect in his post, “Ten Common Threats to Execution Plan Quality:”

Read the whole thing.

Azure Automation

Steph Locke looks at Azure Automation:

Azure Automation is essentially a hosted PowerShell script execution service. It seems to be aimed primarily at managing Azure resources, particularly via Desired State Configurations.

It is, however, a general PowerShell powerhouse, with scheduling capabilities and a bunch of useful features for the safe storage of credentials etc. This makes it an excellent tool if you’re looking to do something with PowerShell on a regular basis and need to interact with Azure.

Read the whole thing.

SSIS Stream Reading And Writing

Jorge Novo has a Script task to read from a file as a stream and write to a different file as a stream:

How many times you need to stream reading from a file and writing into another well here it is…Here we are using special character char(30) and char(31) as delimited so a quick replace using regex did it. It’s Simple,Quick, and Dirty 🙂

The last ten lines are the important ones.  You’d use this stream read+write pattern when dealing with files larger than your total amount of available RAM.

Hadoop For .NET Developers

Kevin Feasel

2016-07-12

Hadoop

Elton Stoneman is working on a new Pluralsight course:

Actually I’m talking about .NET Core. That Docker image is built on Debian Jessie, and it installs Hadoop and .NET Core. You can run compiled .NET Core assemblies just by running dotnet /path/to/my/assembly.dll. The same image can run as Hadoop master and worker nodes, so all workers have .NET Core installed and can run .NET Core apps.

Which means you can run .NET Core apps as mappers or reducers in Hadoop jobs, thanks to theHadoop Streaming interface. Hadoop Streaming lets you shell out to any program the host can execute, to run the tasks for your job. Your code integrates with Hadoop at a basic level – through the standard input and output streams.

Elton has some good courses on other Hadoop-and-.NET topics (like his HDInsight deep dive course).  It looks like running containerized Hadoop + .NET Core is getting closer, but still has room for improvement.

Dealing With Azure Service Disruptions

Vincent-Philippe Lauzon discusses various potential solutions to Azure service disruptions:

What does happen in Azure is Service disruption.  Sometimes those are due to some hardware failures, but most of the time, they are a software problem:  an upgrade in Azure software (Microsoft’s software) gone wrong.  They happen occasionally, are typically short lived but if business continuity is a hard requirement, those are the ones you should protect your solution against.

In this article I’ll cover resilience for virtual machines.  I’ll assume that you are usingAzure Traffic Manager or other DNS service to fail over the traffic from a primary to a secondary region.

He covers “stateless” VMs (like web servers or app servers) as well as “stateful” VMs (like database servers) and explains disaster recovery options for each.

DBCC CLONEDATABASE

Kennie Pontoppidan talks about a new DBCC command:

This command enables you to take a copy of the schema and the statistics for an existing database into a new database. According to the documentation, DBCC CLONEDATABASE

  • Creates a new destination database that uses the same file layout as the source but with default file sizes as the model database.

  • Creates an internal snapshot of the source database.

  • Copies the system metadata from the source to the destination database.

  • Copies all schema for all objects from the source to the destination database.

  • Copies statistics for all indexes from the source to the destination database.

I knew there were ways of scripting out statistics, but DBCC CLONEDATABASE looks like a new and interesting beast.

Database Mail Requires TLS 1.0

Ryan Adams discovered that Database Mail cannot use TLS 1.2 at this time:

You may recall something called the POODLE attack that revealed a vulnerability in SSL 3.0 and TLS 1.0.  This particular server had SSL 3.0, TLS 1.0, and TLS 1.1 disabled in the registry.  Also note that TLS 1.2 was NOT disabled.  The server was running Windows 2012 R2.  These protocols were disabled to prevent the possibility of a POODLE attack.  If you are wondering how to disable these protocols on your servers then look at Microsoft Security Advisory 3009008.  To disable them for the whole OS scroll down to the Suggested Actions section and look under the heading “Disable SSL 3.0 in Windows For Server Software”.

I also want to note that the PCI Security Standards Council pushed back the date for getting off of SSL and TLS 1.0 to June 30th, 2018.  In addition to that, it should also be noted that Microsoft’s Schannel implementation of TLS 1.0 is patched against all known vulnerabilities.

The root cause is interesting:  it’s because Database Mail requires .NET Framework 3.5.  Ryan has more details, including a fix, so read on.

SSIS Deployment Models

Ginger Grant argues in favor of the project deployment model in SSIS:

Prior to SQL Server 2012, there was no project deployment. SSIS code was all deployed as packages. These packages could be stored within MSDB or they could be stored and run from the file system. In disorganized places like the one where I worked, they were deployed in both. Assuming nothing much changed since I left, they have all versions of SQL Server which were released prior to the day the new IT Director started in 2012. There was DTS on SQL Server 97, 2000 and SSIS on 2005, 2008 and 2008 R2. No reason to upgrade anything which still worked was their motto. When space was a problem, one could always go build another server. I think the LAN administrator was happiest when he was able to justify building a new server as he could spend hours shopping for parts on the internet and building the latest server.

I believe that, in the business, we call this a “nightmare scenario.”  Read on for ways in which a project model would help.

Introduction To R

Kevin Feasel

2016-07-12

R

Paul Hernandez has an introduction to using the R client and RODBC to connect to SQL Server:

The first step is to load the RevoScaleR library. This is an amazing library that allows to create scalable and performant applications with R.

Then a connection string is defined, in my case using Windows Authentication. If you want to use SQL Server authentication the user name and password are needed.

We define a local folder as the compute context.

RxInSQLServer: generates a SQL Server compute context using SQL Server R Services –documentation

Sample query: I already prepared the dataset in the view, this is a best practice in order to reduce the size of the query in the R code and for me is also easier to maintain.

I think there’s a lot of value in learning R, regardless of whether you have “data analyst” in your role or job title.

Categories

July 2016
MTWTFSS
« Jun Aug »
 123
45678910
11121314151617
18192021222324
25262728293031