Press "Enter" to skip to content


Azure Data Factory V2 Now In General Availability

Chris Seferlis covers some of the improvements in Azure Data Factory V2:

With ADF V2, you get a browser-based interface with drag-and-drop functionality; V1 development was done primarily in the Visual Studio IDE. V2 also added triggers for scheduling, so you can schedule your jobs when required and in additional ways (which I’ll discuss further in a bit).

Some other features of ADF V2 that came out as it became generally available:

  • Lift and shift operations for your SSIS packages: if you have SSIS packages on-premises, you can now lift and shift them into cloud compute with the Integration Runtime service in Data Factory.

  • This also allows for cloud-to-cloud, cloud-to-prem, and prem-to-prem movement, and some third-party tools are supported within that as well.

  • Control flow activities like branching, looping, conditional execution, and parameterization.

  • Integration with HDInsight Spark and Databricks for big data workloads and data science.

There’s more where that came from, too.
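To give a concrete sense of the new scheduling triggers, here is a minimal sketch of a V2 schedule trigger definition; the trigger and pipeline names are hypothetical, not from Chris’s post:

    {
      "name": "DailyTrigger",
      "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
          "recurrence": {
            "frequency": "Day",
            "interval": 1,
            "startTime": "2018-08-01T06:00:00Z",
            "timeZone": "UTC"
          }
        },
        "pipelines": [
          {
            "pipelineReference": {
              "referenceName": "CopySalesData",
              "type": "PipelineReference"
            }
          }
        ]
      }
    }

A trigger like this fires the referenced pipeline once a day, something V1 could only approximate with dataset availability windows.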


Index Maintenance With Replication

Ajay Dwivedi shares his rules of thumb for index maintenance on replicated databases:

Like any other DBA, I fell into the trap of using a straightforward maintenance solution: a Reorganize operation for indexes with average fragmentation of 30% or less, and an Index Rebuild for average fragmentation greater than 30%.

The above approach works fine in common scenarios, but it can create problems for servers using transaction log-based High Availability technologies such as AlwaysOn Availability Groups, database mirroring, log shipping, and replication. Both index rebuild and reorganize operations introduce heavy transaction log activity and generate a large number of log records. This becomes an issue in the case of a node failover, a server with limited storage, a database file with restricted growth, a wrong file auto-growth setting, or a database with a high VLF count.

The best option for servers with High Availability is to identify the kind of server workload (OLTP/OLAP/mixed), fill factor (based on Page Splits/sec), fragmentation, underlying storage load (random/sequential), Index Scans vs. Index Searches, the job time frame (low activity outside business hours), etc. After weighing all of the above factors, all we need is a robust index maintenance solution. This is where I find Ola Hallengren’s SQL Server Maintenance Solution a perfect fit.

Ajay uses Ola Hallengren’s solution and gives us the breakdown percentages he uses.
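Ajay’s post has his full configuration, but as a rough sketch, mapping reorganize/rebuild thresholds onto Ola Hallengren’s IndexOptimize procedure looks something like this; the database list and exact percentages are illustrative, not Ajay’s specific values:

    -- Illustrative IndexOptimize call (Ola Hallengren's SQL Server Maintenance Solution).
    -- Reorganize indexes between 5% and 30% average fragmentation,
    -- rebuild (online where the edition supports it) above 30%.
    EXECUTE dbo.IndexOptimize
        @Databases           = 'USER_DATABASES',
        @FragmentationLow    = NULL,              -- leave lightly fragmented indexes alone
        @FragmentationMedium = 'INDEX_REORGANIZE',
        @FragmentationHigh   = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
        @FragmentationLevel1 = 5,
        @FragmentationLevel2 = 30,
        @LogToTable          = 'Y';               -- keep history in dbo.CommandLog

On a replicated or log-shipped server, these thresholds (plus the job schedule) are the knobs you tune to keep log generation manageable.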


SQL Server On Containers Q&A

Anthony Nocentino walks us through questions he received after his PASS marathon session on SQL Server and containers:

  • How do SQL Containers work with High Availability and Disaster Recovery?

  • Backups and data persistence are primary concerns here. You still need to care for and feed your SQL Server databases just as if they were running on a full operating system. For HA, Microsoft has some guidance on how to use Kubernetes to provide HA services to your SQL Server containers here. What I want you to think about when using containers for SQL Server is that deploying a new container is VERY fast. We want to persist the data and be able to stand up a new container and mount our data inside it. Using this technique, we can restore SQL Server services very quickly, with a low RTO. That in itself is an interesting way to provide HA services without any additional technologies.

Read on for more Q&A and check out Anthony’s presentation.


Nested Calculations In Power Query

Michael Humpherys shows how to use nested calculations in Power Query to make financial calculations easier:

A central problem in finance is answering the simple question: how much is this contract worth? For example, Bob might say he’ll give me $102 in a year, and I want to know how much I should pay him for that guaranteed money. If I figure out that the value of the contract is $100, then I’m saying that the guaranteed $102 in a year is worth $100 today. This means I get 2% interest on my $100 investment. This is called the one-year spot rate, and there are similar rates for all sorts of different time frames. Taking 1/(1+0.02) gives me the discount factor, and multiplying this by the $102 payment gets me back to the $100 value of the contract.

The next step is that I may want to know how much $102 paid two years from now will be worth next year. So instead of figuring out what it is worth today, I want to know what it will be worth in a year. To figure this out, I need something called the forward rate, which in this example tells me the annual interest rate one year in the future.

With the forward rates, I can take a complex series of future payments and find the value of all of those payments today, but also their value at different points in the future. The complexity is that, depending on the date I want to value them to and the timing of the payments, I need to use different sets of forward rates, and that’s the application I’ll walk through below.

That is a novel use of the “table in a cell” technique.
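Written out as formulas, the arithmetic in the example looks like this, where $s_1 = 0.02$ is the one-year spot rate and $s_2$ (a two-year spot rate, not given in the quote) is what you would need to derive the forward rate:

$$d_1 = \frac{1}{1 + s_1} = \frac{1}{1.02} \approx 0.9804, \qquad PV = \$102 \times d_1 \approx \$100$$

$$(1 + s_2)^2 = (1 + s_1)(1 + f_{1,2}) \;\Rightarrow\; f_{1,2} = \frac{(1 + s_2)^2}{1 + s_1} - 1, \qquad V_1 = \frac{\$102}{1 + f_{1,2}}$$

Here $f_{1,2}$ is the forward rate between years one and two, and $V_1$ is the value one year from now of the $102 payment due in two years.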


The Corruption Scenario That Wasn’t

Andy Galbraith walks us through a scary-looking error message that turned out to be not quite accurate:

There were a stack of errors overnight in the DB123 database on SQL01, including one horror show error:

Log Name:      Application
Source:        MSSQLSERVER
Date:          5/24/2018 12:49:39 AM
Event ID:      3314
Task Category: Server
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      SQL01.client99.com
Description:
During undoing of a logged operation in database ‘DB123’, an error occurred at log record ID (195237:96178:4). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.

The catch is that when I checked the server, the database was online.

Say what?

Read on for the root cause analysis and issue correction. This is a good example of how jumping to conclusions can lead you down the wrong path and miss the root cause entirely.


Overview: U-SQL Database Projects

Zach Stagers gives us an overview of the new U-SQL Database Project structure:

Source Control

The project integrates much more nicely with TFS than the older “U-SQL Project” does.

It actually gives you the icons (padlock, check mark, etc.) in Solution Explorer, so it looks like it’s under source control!

Something that I’d really hoped had been fixed, but hasn’t, is that when copying and renaming an existing item, the rename isn’t recognized. You have to undo the checkout of the non-existent object (the copy, before it was renamed).

Read on for more improvements.


The Intersection Of Multiple Averages Is The Empty Set

Adi Gaskell argues that we shouldn’t get too wrapped up in “average” behaviors:

I’ve written extensively about the tremendous potential for big data in healthcare to drive enormous changes in how we keep people healthy for longer. It goes without saying, however, that all data is not created equal, and just having a large sample is not always sufficient to get the best insights.

If we needed reminding, a reminder comes via a recent study from the University of California, Berkeley. It suggests that things like emotion, behavior, and physiology vary hugely between individuals, so an average over a large dataset can still produce a ‘norm’ that is wide of the mark for individuals.

“If you want to know what individuals feel or how they become sick, you have to conduct research on individuals, not on groups,” the researchers say. “Diseases, mental disorders, emotions, and behaviors are expressed within individual people, over time. A snapshot of many people at one moment in time can’t capture these phenomena.”

Variance is important.


Building An Extended Events Session

Aamir Syed gives us a simple example of using the Extended Events UI to create a new session:

Many of us have not made the effort to switch from Profiler to Extended Events. It’s 2018; if you haven’t found a few hours to learn about this incredibly powerful tool, I urge you to do so now.

I’m going to provide a quick means of tracking queries with Extended Events. This is not an example of how comprehensive the tool is, but I hope that it at least spurs some interest.

One of the main reasons we use Profiler is to quickly capture some real-time data. I’m going to not only show you how to do that with Extended Events, but also show how this same session can serve as a historical view, since it’s so easy to sift through and filter the data. (No, you don’t have to create a table for the result sets à la Profiler.)

Click through for step-by-step instructions.
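As a rough idea of the T-SQL behind what the session designer generates, a minimal query-capture session might look like the following; the session name, database filter, and file name are illustrative, not Aamir’s exact settings:

    -- Capture completed batches and RPC calls, filtered to one database,
    -- writing to an event_file target for live or historical analysis.
    CREATE EVENT SESSION QueryCapture ON SERVER
    ADD EVENT sqlserver.sql_batch_completed
    (
        ACTION (sqlserver.sql_text, sqlserver.username, sqlserver.client_app_name)
        WHERE sqlserver.database_name = N'MyDatabase'
    ),
    ADD EVENT sqlserver.rpc_completed
    (
        ACTION (sqlserver.sql_text, sqlserver.username, sqlserver.client_app_name)
        WHERE sqlserver.database_name = N'MyDatabase'
    )
    ADD TARGET package0.event_file
    (
        SET filename = N'QueryCapture.xel', max_file_size = 50
    )
    WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);

    ALTER EVENT SESSION QueryCapture ON SERVER STATE = START;

    -- Read the captured events back later, no staging table required:
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file(N'QueryCapture*.xel', NULL, NULL, NULL);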


Azure SQL Database Service By Purchase Model

Glenn Berry explains the two purchase models available with Azure SQL Database, as well as the various service tiers within each model:

The older pricing option is the DTU-based SQL purchase model, where a fixed set of resources is assigned to the database from three performance tiers, which are Basic, Standard, and Premium.

For Standard and Premium, there are multiple service tiers, which are classified according to how many Database Transaction Units (DTUs) they provide (along with their included storage and maximum available storage). The Premium tier is designed for I/O intensive workloads, and is fault-tolerant.

The Database Transaction Unit (DTU) is based on a blended measure of CPU and memory, along with storage reads and writes. The DTU-based performance levels represent preconfigured bundles of compute, memory, and storage resources designed to drive different levels of application performance. If you do not want to worry about the underlying resources and prefer the simplicity of a preconfigured resource bundle while paying a fixed amount each month, you may find the DTU-based model more suitable for your needs and easier to understand.

Glenn does a good job clearing up some of the complications around pricing for Azure SQL Database.
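If you want to see (or change) where a given database sits, both purchase models expose the tier through T-SQL; the database name and target objective below are placeholders:

    -- Check the current edition and service objective of an Azure SQL Database.
    SELECT
        DATABASEPROPERTYEX(N'MyAzureDb', 'Edition')          AS Edition,
        DATABASEPROPERTYEX(N'MyAzureDb', 'ServiceObjective') AS ServiceObjective;

    -- Scale the database to a different DTU-based performance level.
    ALTER DATABASE [MyAzureDb]
        MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S3');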
