Press "Enter" to skip to content

Author: Kevin Feasel

Migrating To Azure SQL Database

John Sterrett discusses ways to migrate an on-prem database to Azure SQL Database:

This is great, except for the case I want to talk about today. What if you need to do a live migration of a bigger database, say an existing 50 GB to 500 GB database, with as little downtime as possible (the business wants no downtime)? Your only option today is a good old friend of mine called transactional replication. You see, you can configure transactional replication and have the snapshot occur, and all data in your current production system can be syncing live with your Azure SQL Database until it’s time to cut over, which will make your cutover downtime as short as possible.

Below I will give you step-by-step instructions on how you can configure your subscriber. This would be your Azure SQL Database. The publisher would be your existing production database, which could be either on-premises or on an Azure VM.

Ah, replication:  the cause of, and solution to, all of life’s problems.  Or something.  Do read the whole thing.
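
For the curious, the publisher-side setup uses the standard transactional replication procedures. This is a minimal sketch only, assuming distribution is already configured; the database, article, server, and login names here are hypothetical, and John’s post covers the full set of steps:

-- Run at the publisher; all names are placeholders.
USE [ProdDB];
GO
EXEC sp_replicationdboption
    @dbname = N'ProdDB',
    @optname = N'publish',
    @value = N'true';

EXEC sp_addpublication
    @publication = N'AzureMigration',
    @status = N'active',
    @allow_push = N'true',
    @repl_freq = N'continuous',
    @independent_agent = N'true';

EXEC sp_addpublication_snapshot @publication = N'AzureMigration';

EXEC sp_addarticle
    @publication = N'AzureMigration',
    @article = N'Orders',
    @source_owner = N'dbo',
    @source_object = N'Orders';

-- Azure SQL Database can only be a push subscriber, so the subscription
-- and the Distribution Agent are both configured from the publisher side.
EXEC sp_addsubscription
    @publication = N'AzureMigration',
    @subscriber = N'yourserver.database.windows.net',
    @destination_db = N'TargetDB',
    @subscription_type = N'push';

EXEC sp_addpushsubscription_agent
    @publication = N'AzureMigration',
    @subscriber = N'yourserver.database.windows.net',
    @subscriber_db = N'TargetDB',
    @subscriber_security_mode = 0,   -- Azure SQL Database requires SQL authentication
    @subscriber_login = N'azureadmin',
    @subscriber_password = N'StrongPassword!';
GO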

Comments closed

Azure IaaS Disk Differences

Rolf Tesmer looks at whether to use the local SSD or Premium Storage disks for SQL Server instances on Azure IaaS:

To test the throughput I will run a set test with the TempDB database on D:\ (local SSD) and then rerun the test again with the TempDB moved onto F:\ (P30 premium disk).  Between the tests SQL Server is restarted so we’re starting with a clean cache and state.

The test SQL script will create a temporary table and then run a series of insert, update, select and delete queries against that table.  We’ll then capture and record the statistics and time.

The results were interesting; read on to learn more.
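
If you want to run a similar comparison yourself, the harness can be as simple as the sketch below. The table shape and iteration count are my own assumptions, not Rolf’s actual script:

SET STATISTICS TIME ON;

-- The temp table lands in TempDB, which is what we're measuring.
CREATE TABLE #Throughput
(
    Id INT IDENTITY PRIMARY KEY,
    Payload CHAR(4000) NOT NULL DEFAULT 'x'
);
GO

-- Insert phase; GO n is the SSMS batch-repeat shorthand.
INSERT INTO #Throughput DEFAULT VALUES;
GO 100000

-- Update, select, and delete phases.
UPDATE #Throughput SET Payload = 'y' WHERE Id % 10 = 0;
SELECT COUNT(*) FROM #Throughput WHERE Payload = 'y';
DELETE FROM #Throughput WHERE Id % 2 = 0;

DROP TABLE #Throughput;
GO

Run it once with TempDB on the local SSD and again after moving TempDB to the premium disk, restarting SQL Server in between as Rolf does.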

Comments closed

Manning’s Equation

John Yagecic has a Shiny app which gives a Monte Carlo analysis of Manning’s Equation:

Monte Carlo analysis is a great way to explore the impact of input variable uncertainty on the results of engineering equations, and with vector variables and distribution and sampling functions at its core, R is a natural platform for this analysis.

Check out his app, which has a link to the code.  Amazingly, this is only 107 lines of code.
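
For reference, since the post doesn’t restate it: Manning’s equation estimates the average velocity of flow in an open channel. In SI units,

V = \frac{1}{n} R_h^{2/3} S^{1/2}

where n is the Manning roughness coefficient, R_h is the hydraulic radius, and S is the channel slope; discharge is then Q = VA for cross-sectional area A. The Monte Carlo approach samples n, R_h, and S from distributions rather than treating them as point estimates, which is exactly the kind of vectorized sampling R makes easy.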

Comments closed

Intro To Data Analytics

Stacia Varga has a follow-up to her Gentle Introduction to Data Analytics:

Q: How is data analytics on SAS different from R support for SQL 2016?

A: I’ve not used SAS (although I did go to their campus to teach MDX to their engineers once upon a time… beautiful campus!), so I can’t say with any specificity. SAS also has multiple products so it would be difficult to describe differences. In general – and please understand there may be something new in SAS that I don’t know about – I think the difference is that R in SQL 2016 allows you to run functions on data inside the database, thereby leveraging database resources and also server resources for parallelization. I’d love to get input from readers to expand on this topic.

I can confirm that SAS has a beautiful campus.
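
To make the R-in-SQL-2016 point concrete, here is what running R inside the database looks like. The table and column names are hypothetical:

EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- data.frame(avg_amount = mean(InputDataSet$SalesAmount))',
    @input_data_1 = N'SELECT SalesAmount FROM dbo.FactSales'
WITH RESULT SETS ((avg_amount FLOAT));

The query’s result set is handed to R as InputDataSet, the R script runs on the database server, and whatever lands in OutputDataSet comes back as a normal result set; the data never leaves the instance.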

Comments closed

Logging WhoIsActive Output

Tara Kizer has a primer on storing WhoIsActive outputs for subsequent analysis:

Create a new job and plop the below code into the job step, modifying the first 3 variables as needed. The code will create the logging table if it doesn’t exist, the clustered index if it doesn’t exist, log current activity and purge older data based on the @retention variable.

How often should you collect activity? I think collecting sp_WhoIsActive data every 30-60 seconds is a good balance between logging enough activity to troubleshoot production problems and the storage needed to keep the data in a very busy environment.

I like having something like this in place because oftentimes, when you need these results, it’s already too late.
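
The core of the pattern looks something like this sketch; the table name and retention value are assumptions, and Tara’s post has the full job-step code, including creating the table and clustered index:

DECLARE @retention INT = 7;  -- days of history to keep

-- Log current activity to a table (sp_WhoIsActive adds a collection_time column).
EXEC dbo.sp_WhoIsActive
    @get_transaction_info = 1,
    @get_plans = 1,
    @destination_table = N'DBA.dbo.WhoIsActiveLog';

-- Purge anything older than the retention window.
DELETE FROM DBA.dbo.WhoIsActiveLog
WHERE collection_time < DATEADD(DAY, -@retention, GETDATE());

Schedule the job every 30-60 seconds, per Tara’s guidance.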

Comments closed

Enterprise R Security

Ramkumar Chandrasekeran discusses DeployR, an enterprise security model for R:

DeployR Enterprise is designed to deliver analytics solutions at scale to whomever needs them, inside or outside the enterprise. It also guarantees secure delivery of your analytics via DeployR web services. These secure web services integrate seamlessly with existing enterprise security solutions (Single Sign-On, LDAP, Active Directory, PAM, and Basic Authentication), can enforce access privileges already defined by your IT department for existing enterprise users, and can also safely support anonymous users when needed.

There’s nothing groundbreaking here:  it’s TLS (to encrypt network transmissions) and LDAPS (to control authentication and authorization).  That there’s nothing groundbreaking is a good thing—that means companies will have most of the infrastructure in place to support this.

Comments closed

SQL Server 2016 CU1 Bugfixes

Andrew Pruski notes an important bugfix in SQL Server 2016 CU1:

SQL Server 2016 CU1 has been released, and one thing I noticed was:

FIX: Canceling a backup task crashes SQL Server 2014 or 2016

That’s pretty nasty. When I originally clicked on the link, I was expecting to see a pretty precise set of circumstances detailed in which that bug can occur, but no, apparently not: cancelling any backup task can lead to this happening.

Andrew then argues in favor of waiting for service packs before deploying new versions of software, having been burned in the past.  I don’t agree with that philosophy; regardless, I recommend reading his post.
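
If you’re unsure whether an instance has the fix, checking the build is trivial (SQL Server 2016 RTM CU1 should report build 13.0.2149):

SELECT SERVERPROPERTY('ProductVersion') AS build,
       SERVERPROPERTY('ProductLevel')   AS patch_level;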

Comments closed

Traveling Salesmen

Jesse Clark explains the traveling salesman problem:

One of the canonical questions in operations is the traveling salesman problem (TSP). In its simplest form, we have a busy salesperson who must visit a set number of locations once. Time is money, so the salesperson wants to choose a route that minimizes the total distance traveled. It is not so hard to imagine these path optimization problems occurring within warehouses where people (‘pickers’) need to navigate aisles and fill orders as they go.

The Traveling Salesman Problem is a computer science classic and a canonical graph optimization problem.  Check this post out for more details.
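
Formally, given pairwise distances d(i, j) over n locations, the problem is to find the permutation (tour) that minimizes total distance travelled:

\min_{\pi \in S_n} \sum_{i=1}^{n} d\big(\pi(i), \pi(i+1)\big), \qquad \pi(n+1) := \pi(1)

The search space grows factorially in n, which is why exact solutions quickly give way to heuristics in practice.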

Comments closed

Netflix Billing

Subir Parulekar and Rahul Pilani describe how they moved their billing data out of a data center and into AWS:

Now the only (and most important) thing remaining in the Data Center was the Oracle database. The dataset that remained in Oracle was highly relational and we did not feel it to be a good idea to model it to a NoSQL-esque paradigm. It was not possible to structure this data as a single column family as we had done with the customer-facing subscription data. So we evaluated Oracle and Aurora RDS as possible options. Licensing costs for Oracle as a Cloud database and Aurora still being in Beta didn’t help make the case for either of them.
 
While the Billing team was busy in the first two acts, our Cloud Database Engineering team was working on creating the infrastructure to migrate billing data to MySQL instances on EC2. By the time we started Act III, the database infrastructure pieces were ready, thanks to their help. We had to convert our batch application code base to be MySQL-compliant since some of the applications used plain jdbc without any ORM. We also got rid of a lot of the legacy pl-sql code and rewrote that logic in the application, stripping off dead code when possible.
Our database architecture now consists of a MySQL master database deployed on EC2 instances in one of the AWS regions. We have a Disaster Recovery DB that gets replicated from the master and will be promoted to master if the master goes down. And we have slaves in the other AWS regions for read only access to applications.

Read the whole thing.  Their architectural requirements probably won’t be yours (unless you’re working at a company at the scale of Netflix), but it’s quite interesting seeing how they solve their problems.

Comments closed