Press "Enter" to skip to content

Author: Kevin Feasel

Data File Migration With Minimal Downtime

Nate Johnson weaves a yarn around moving from one storage system to another with minimal downtime:

Our ERP database has been chosen by the IT gods to get moved to the shiny new flash storage array, off the old spinning-rust SAN.  This is fantastic news for the business users.  But lo, the executives warn us, “You must do this with no downtime!” (said in my best Brent Ozar PHB-imitation voice).  Of course when we tell them that’s impossible, they say, “OK, you must do this with minimal downtime.”  That’s mo’ betta’.

So what are our typical options for doing a database migration?  Or, more specifically, a data file migration.  See, we’re not moving to a new server, and we’re not moving a bunch of databases together; we’re just moving this one ERP database.  And we’re keeping it on the same SQL instance, we’re just swapping the storage underneath.

Click through for some discussion on options, followed by implementation of a particular strategy.

1 Comment

A Hack For Dynamic ML Services Result Sets

Dave Mason has put together a solution to his dynamic data frame naming problem:

We can take those names and R types, string them together, and “convert” them to SQL data types. (Mapping data types from one language to another is waaaay outside the scope of this post. Lines 11-13 are quick and dirty, just for demonstration purposes. Okie dokie?)

It’s certainly a less than ideal solution, but it does the job.  And I hope as well that someday this functionality ends up being something built into the product.

Comments closed

Decrypting Always Encrypted Columns In SSMS

Monica Rathbun shows how to view Always Encrypted data within Management Studio:

Viewing decrypted data within SQL Server Management Studio (SSMS) is very easy. SSMS uses .NET 4.6 and the modern SQL Server client, so you can pass in the necessary encryption options. SSMS uses the connection string to access the Master Key and return the data in its decrypted format.

First create a new SQL Connection and Click Options to expand the window.

Then go to the Additional Connections Parameters Tab of the login window and simply type column encryption setting = enabled. Then choose Connect.

Click through to see the whole demo.

Comments closed

String Formatting With Powershell

Thomas Raynor provides a quick tip for string formatting in Powershell:

And, for some reason, instead of the default output which is formatted like a table, I want output presented like this.

This is a silly example, but notice that even though there are extensions of varying length (.ps1 and .dll are four characters including the dot, and .xlsx is five), all of the “file extension: <number>” is aligned.

How’d I do that?

Read on to learn how.

Comments closed

Blockchain For Business Notes

Allison Tharp has some notes on an edX course entitled Blockchain for Business.  This looks like it will be a multi-part series.  Part one:

distributed ledger is a data structure that is spread across multiple computers (which are usually spread across locations or regions).  Distributed ledger technologies have three basic components:

  • A data model to capture the current state of the ledger
  • A language of transactions to track the changes in the ledger state
  • A protocol that builds consensus among participants around which the transactions can be accepted

In other words, we can think of a distributed ledgers as databases which are shared among peers and do not rely on any central authority or intermediary.  Instead of having a central database, every participant has their own copy which stays in sync via the pre-established protocol.  Each participant verifies transactions and speaks a common language to ensure universal agreement on the state of the ledger.

Part two:

Another consensus algorithm is called the Proof of Stake algorithm.  With this algorithm, the nodes are known as validators and instead of mining the blockchain, they validate the transactions to earn a transaction fee.  Instead of creating new coins (as is the case in Bitcoin), all of the coins exist from the very beginning.  Another way to look at this is that the nodes are randomly selected to validate blocks.  The likelihood of the random selection will depend on how many coins the node holds (this is known as the amount of stake they hold).

Blockchain has gone from wacky idea to interesting business concept over the course of about a decade.  It’ll be interesting to see if it catches on to be a vital business concept in the next ten years.

Comments closed

Avro Schemas In Kafka

Stephane Maarek explains the value of using Apache Avro as a schema structure for your Kafka topics:

  • Avro has support for primitive types ( intstringlongbytes, etc…), complex types (enumarraysunions, optionals), logical types (datestimestamp-millisdecimal), and data record (name and namespace). All the types you’ll ever need.

  • Avro has support for embedded documentation. Although documentation is optional, in my workflow I will reject any Avro Schema PR (pull request) that does not document every single field, even if obvious. By embedding documentation in the schema, you reduce data interpretation misunderstandings, you allow other teams to know about your data without searching a wiki, and you allow your devs to document your schema where they define it. It’s a win-win for everyone.

  • Avro schemas are defined using JSON. Because every developer knows or can easily learn JSON, there’s a very low barrier to entry

Read on for more about Avro as well as the possibilities of using other techniques for defining schemas in Kafka.

Comments closed

When Spark Meets Hive

Anna Martin and Rosaria Silipo look at combining HiveQL and SparkQL:

We set our goal here to investigate the age distribution of Maine residents, men and women, using SQL queries. But the question is… on Apache Hive or on Apache Spark? Well, why not both? We could use SparkSQL to extract men’s age distribution and HiveQL to extract women’s age distribution. We could then compare the two distributions and see if they show any difference.

But the main question, as usual, is: Will SparkSQL queries and HiveQL queries blend?

Topic: Age distribution for men and women in the U.S. state of Maine.

Challenge: Blend results from Hive SQL and Spark SQL queries.

Access mode: Apache Spark and Apache Hive nodes for SQL processing.

Using KNIME, the authors are able to blend together data from different sources.

Comments closed

Troubleshooting Polybase Installation Errors

John Paul Cook has an article looking at what to do when you have Oracle JRE 9 and want to use Polybase:

This article shows you how to troubleshoot a failed installation of SQL Server and how to implement a workaround to allow SQL Server 2017’s PolyBase feature to be installed when version 9 of the Java Runtime Environment (JRE) is present. An installation of all features in SQL Server 2017 has three external dependencies. Python, R, and the JRE are third party or open source software needed in a full installation. Changes to external software after the release of SQL Server 2017 can introduce breaking changes. Oracle, the company that owns Java, changed how Windows registry keys are named. This caused a breaking change for SQL Server 2017. Version 8 of the JRE is compatible with the SQL Server 2017 installer. Version 9 of the JRE is not. If version 9 of the JRE is the only version of the JRE on a Windows machine, it is not possible to install the PolyBase feature. The JRE version bug also is found in the SQL Server 2016 installer. The same workaround works for both SQL Server 2016 and 2017.

Good article.

Comments closed

Using mssql-cli

Alan Yu announces mssql-cli, a command-line interface for SQL Server:

Mssql-cli is a new and interactive command line tool that provides the following key enhancements over sqlcmd in the Terminal environment:

  • T-SQL IntelliSense
  • Syntax highlighting
  • Pretty formatting for query results, including Vertical Format
  • Multi-line edit mode
  • Configuration file support

Mssql-cli aims to offer an improved interactive command line experience for T-SQL. It is fully open source under the BSD-3 license, and a contribution to the dbcli organization, an open source suite of interactive CLI tools for relational databases including SQL Server, PostgresSQL, and MySQL. The command-line UI is written in Python and the tool leverages the same microservice backend (sqltoolsservice) that powers the VS Code SQL extension, SQL Operations Studio, and the other Python CLI tool we announced earlier, mssql-scripter.

Something very cool that Alan points out is that this is “the first time our team is contributing source code to an existing open source organization with a commitment to be a good citizen in an existing open source community.”

Comments closed

The Power BI Gateway

Reza Rad explains how the Power BI gateway works:

You don’t need a gateway in all scenarios. Only if the data source is located on-premises, you need a gateway. For online or cloud-based data sources, no gateway is required. For example; if you are getting data from CRM Online, you don’t need a gateway. However, if you are getting data from SQL Server database located on your local domain server, then you need a gateway. For Azure SQL DB you don’t need a gateway. However, a SQL Server database located on Azure Virtual Machine is considered as on-premises and needs gateway.

This post could not have come at a better time for me, so I’m definitely happy to see it.

Comments closed