Category: Cloud

Google Compute Engine Whitepapers

Brent Ozar Unlimited has a couple of whitepapers out about working with SQL Server in Google Compute Engine.  First, Brent and Tara Kizer create an Availability Group:

In this white paper we built with Google, we’ll show you:

  • How to build your first Availability Group in Google Compute Engine

  • How to test your work with four failure simulations

  • How to tell whether your databases will work well in GCE

Erik Darling also has a whitepaper on performance tuning:

Relax. Have a drink. In this white paper we built with Google, we’ll show you:

  • How to measure your current SQL Server using data you’ve already got

  • How to size a SQL Server in Google Compute Engine to perform similarly

  • After migration to GCE, how to measure your server’s bottleneck

  • How to tweak your SQL Server based on the performance metrics you’re seeing

If you’re looking at GCE as a potential migration destination, you’ve got some extra reading material.
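
If you want a head start on the measurement piece, wait statistics are the classic example of “data you’ve already got.”  Here is a minimal sketch (assuming Python with pyodbc; the server name, driver, and credentials are placeholders) of sampling the wait-stats DMV:

```python
import pyodbc

# Minimal sketch: sample the wait-stats DMV to see where the instance
# actually spends its time.  Connection details are placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 13 for SQL Server};"
    "SERVER=yourserver;DATABASE=master;Trusted_Connection=yes;"
)
rows = conn.execute("""
    SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;
""").fetchall()
for wait_type, wait_ms, task_count in rows:
    print(f"{wait_type:<40} {wait_ms:>12} ms across {task_count} waits")
```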


Copying Azure SQL Databases Between Subscriptions

Arun Sirpal shows that it’s pretty easy to copy an Azure SQL Database from one subscription to another:

If you ever need to move a copy of a SQL database in Azure across servers, here is a quick and easy way.

So let’s say you need to take a copy of a database called [Rack] within Subscription A, on server ABCSQL1, and name it [NewRack] within Subscription B on a server called RBARSQL1 (the SQL Servers are in totally different data centers, too).

Read on for the answer.
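
For reference, the heart of the trick is the documented CREATE DATABASE … AS COPY OF syntax, run against the target server’s master database.  A hedged sketch in Python with pyodbc, using Arun’s server and database names (credentials are placeholders):

```python
import pyodbc

# Hedged sketch: Azure SQL Database copies are started from the *target*
# server's master database, referencing the source as [server].[database].
# Credentials are placeholders; server and database names are Arun's.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 13 for SQL Server};"
    "SERVER=rbarsql1.database.windows.net;DATABASE=master;"
    "UID=serveradmin;PWD=<password>",
    autocommit=True,  # CREATE DATABASE cannot run inside a transaction
)
conn.execute("CREATE DATABASE [NewRack] AS COPY OF [ABCSQL1].[Rack];")
# The copy runs asynchronously; progress shows up in sys.dm_database_copies
# on the target server.
```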


Using Azure Data Factory With Biml

Meagan Longoria has a multi-part series on using Biml to script Azure Data Factory tasks to migrate data from an on-prem SQL Server instance to Azure Data Lake Store.  Here’s part 1:

My Azure Data Factory is made up of the following components:

  • Gateway – allows ADF to retrieve data from an on-premises data source

  • Linked Services – define the connection string and other connection properties for each source and destination

  • Datasets – define a pointer to the data you want to process, sometimes defining the schema of the input and output data

  • Pipelines – combine the datasets and activities and define an execution schedule

Click through for the Biml.
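
To give a flavor of what Biml would be scripting out at the Linked Services layer, here is a hedged sketch of a classic (v1) ADF linked service definition for an on-premises SQL Server, written as a Python dict (the type and property names follow the v1 JSON schema as best I recall; all values are placeholders):

```python
import json

# Hedged sketch of an ADF v1 linked service for an on-premises SQL Server,
# reached through the gateway component described above.  Values are placeholders.
linked_service = {
    "name": "OnPremSqlLinkedService",
    "properties": {
        "type": "OnPremisesSqlServer",
        "typeProperties": {
            "connectionString": (
                "Data Source=MyServer;Initial Catalog=MyDb;"
                "Integrated Security=True;"
            ),
            "gatewayName": "MyOnPremGateway",
        },
    },
}
print(json.dumps(linked_service, indent=2))
```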


Azure Data Lake Store On Older Hadoop Versions

Amit Kulkarni shows how to install Azure Data Lake Store support on your “older” Hadoop clusters:

How old is really old?

The Azure Data Lake Store binaries have been broadly certified for Hadoop distributions 3.0 and above. We are really in uncharted territory for lower versions, so the farther below 3.0 you go, the higher the likelihood of them not working. My personal recommendation is to go no lower than 2.6. After that, your mileage may really vary.

This is a good article, and do check it out.  A very small mini-rant follows:  Hadoop version 2.6 is not old.  Nor is 2.7.  2.7 is the most recent production-worthy branch, and 3.0 isn’t expected to go GA until August.
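
For context, hooking a cluster up to Azure Data Lake Store mostly comes down to a handful of core-site.xml properties for the hadoop-azure-datalake binaries.  A hedged sketch, shown as a Python dict for readability (property names are my best recollection of the module’s settings; values are Azure AD service principal placeholders):

```python
# Hedged sketch: core-site.xml settings commonly needed for ADLS access,
# expressed as a dict for readability.  Property names are my best
# recollection of the hadoop-azure-datalake module; values are placeholders.
adls_core_site = {
    "fs.adl.impl": "org.apache.hadoop.fs.adl.AdlFileSystem",
    "dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
    "dfs.adls.oauth2.client.id": "<application-id>",
    "dfs.adls.oauth2.credential": "<application-key>",
    "dfs.adls.oauth2.refresh.url": (
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
    ),
}
```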


Serverless Azure

Christos Matskas has an article on Azure Functions, Service Fabric, and Batch:

This service is the hidden gem of HPC (high performance computing) within the Azure Compute service family. As the name implies, Azure Batch is designed to run large-scale and high-performance computing applications efficiently in the cloud. When you’re faced with large workloads, all you have to do is to use Azure Batch to define compute resources to execute your applications in parallel and at the desired scale. A good use-case for Azure Batch would be to perform financial risk modelling, climate data analysis or stress testing. What makes Batch so useful is the fact that you don’t need to manually manage the node cluster, virtual networks or scheduling because all this is handled by the service. You need to define a job, any associated data and the number of nodes you want to utilise. It makes no difference if you need to run on one, a hundred or even thousands of nodes. The service is designed to scale according to the workload needs.

The cheapest server may very well be no server, and we’re at the point where relatively simple services could just run as Azure Functions or AWS Lambda functions.
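
To make the “no server” point concrete, here is about the smallest deployable unit imaginable: a hedged sketch of an AWS Lambda handler in Python, where the event shape is hypothetical:

```python
# A complete "serverless" service: one function, no instance to manage.
# The event payload shape here is hypothetical.
def handler(event, context):
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```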


The Cloud DBA

Kendra Little thinks about the evolution of the DBA role:

Lots of things have been reported to kill the DBA over the years:

SQL Server 2005 was said to be “self-tuning”! Who needs a DBA when the instance tunes itself? (Apparently everyone.)

Outsourcing: All the DBA jobs are going to X location, then Y location, then Z location. Then back to X. DBA jobs have become more global, but “outsourcing” hasn’t gotten rid of DBA jobs in the United States. It has been part of the trend to make working remotely more normal and easy, which is generally good for DBAs.

DevOps! All the developers will manage everything. And somehow know to do so.  I love DevOps, and I have seen it wipe out some QA departments, but I haven’t seen it wipe out DBAs. I think it’s fun to be a DBA working with a DevOps team.

Consider this in contrast to Dave Mason’s concern.  My perspective is a lot closer to Kendra’s, but both posts make the good point that IT roles are ever-shifting.


Analyzing Flight Data With Sparklyr

Aki Ariga continues his sparklyr series with some analysis of US flight data:

In this post, we will show you a visualization and build a predictive model of US flights with sparklyr. Flight visualization code is based on this article.

This post assumes you already have the necessary tables available through Apache Hive or Apache Impala (incubating) with Hue.

There’s some setup work to get this going, but getting a handle on sparklyr looks to be a good idea if you’re in the analytics space.


Azure Container Service Supports Kubernetes

Serdar Yegulalp reports that Azure Container Service now supports the Kubernetes container management system:

Microsoft emphasized “choice” when it originally introduced Azure Container Service. Although it launched without Kubernetes, Azure initially supported Mesosphere DC/OS and Docker Swarm because the majority of Microsoft’s customers used them and the company believed they would be well served by the support.

Since then, Kubernetes has emerged as a clear leader among container orchestration solutions. It is used as an underpinning for deep learning frameworks and the basis for an open source serverless/“lambda” app framework, as well as offered as a managed on-premises service by one company.

Kubernetes on Azure is strictly focused on running Kubernetes within Azure, not providing it as a service elsewhere. But the GA release includes additions meant to appeal to a broad audience of both Linux and Windows Server users, such as support for the latest version of DC/OS (1.8.8).

It’s an interesting world out there.


Getting Started With Azure Cognitive Services

Rolf Tesmer has a demo app showing what Azure Cognitive Services Text Analytics can do:

Each execution of the application on any input file will generate 3 text output files with the results of the assessment.  The application runs at a rate of about 1-2 calls per second (the max send rate cannot exceed 100/min as this is the API limit).

  • File 1 [AzureTextAPI_SentimentText_YYYYMMDDHHMMSS.txt] – the sentiment score between 0 and 1 for each individual line in the Source Text File.  The entire line in the file is graded as a single data point.  0 is negative, 1 is positive.

  • File 2 [AzureTextAPI_SentenceText_YYYYMMDDHHMMSS.txt] – if the “Split Document into Sentences” option was selected then this contains each individual sentence in each individual line with the sentiment score of that sentence between 0 and 1.  0 is negative, 1 is positive.  RegEx is used to split the line into sentences.

  • File 3 [AzureTextAPI_KeyPhrasesText_YYYYMMDDHHMMSS.txt] – the key phrases identified within the text on each individual line in the Source Text File.

Rolf has also put his code on GitHub, so read on and check out his repo.
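
If you just want the shape of the underlying call that an app like Rolf’s makes, here is a hedged sketch against the Text Analytics v2.0 sentiment endpoint (region, key, and document text are placeholders):

```python
import requests

# Hedged sketch of one sentiment call; region, key, and text are placeholders.
endpoint = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
headers = {
    "Ocp-Apim-Subscription-Key": "<your-key>",
    "Content-Type": "application/json",
}
payload = {"documents": [
    {"id": "1", "language": "en", "text": "The service was fast and friendly."},
]}
resp = requests.post(endpoint, headers=headers, json=payload)
resp.raise_for_status()
for doc in resp.json()["documents"]:
    # Scores run from 0 (negative) to 1 (positive), as in the output files above.
    print(doc["id"], doc["score"])
```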
