Cloud – Page 132 – Curated SQL

Ingesting data into the Data Lake occurs in steps 1 and 2 in our architecture. Azure Data Factory (ADF) provides an excellent mechanism for loading data from source applications into a Data Lake stored in Azure Data Lake Storage Gen2. In fact, Microsoft offers a template in the ADF Template gallery which provides a metadata-driven approach for doing so. The template comes with a control table example in a SQL Server database, a data source dataset and a data destination dataset. More on this template can be found in the official documentation.
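
As a rough illustration of what "metadata-driven" means here, the sketch below treats a control table as a list of rows and derives the copy tasks from it. The `ControlRow` fields, table names, and lake paths are hypothetical and are not the schema used by the ADF template.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ControlRow:
    """One row of a hypothetical control table (not the ADF template's schema)."""
    source_table: str
    destination_path: str
    enabled: bool = True


def plan_copy_tasks(control_rows: Iterable[ControlRow]) -> List[Tuple[str, str]]:
    """Return (source, destination) pairs for every enabled control row."""
    return [
        (row.source_table, row.destination_path)
        for row in control_rows
        if row.enabled
    ]


# Hypothetical control table contents, for illustration only.
rows = [
    ControlRow("dbo.Customer", "raw/customer/"),
    ControlRow("dbo.Orders", "raw/orders/"),
    ControlRow("dbo.Archive", "raw/archive/", enabled=False),
]

for source, destination in plan_copy_tasks(rows):
    print(f"copy {source} -> {destination}")
```

The point of the pattern is that adding a new source means adding a row to the control table rather than building a new pipeline.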

I appreciate that this is a full walkthrough of the process, not just one step.


Managed Instance Challenges

Published 2019-08-02 by Kevin Feasel

Joey D’Antoni has a few real-world challenges with migrating to Azure SQL Managed Instances:

While DMS is pretty interesting tooling, I had mostly ignored it until recently. Functionally, the tool works pretty well. The problem is it requires a lot of privileges–you have to have someone who can create a service principal and you need to have the following ports open between your source machine and your managed instance:
– 443
– 53
– 9354
– 445
– 12000
While the scope of those firewall rules is limited, in a larger enterprise, explaining why you need port 445 open to anything is going to be challenging.
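
A quick way to see which of those ports are reachable from the source machine is a plain TCP connection test. This is a minimal sketch using only the Python standard library; the host name in the usage comment is a placeholder, and a TCP check says nothing about UDP traffic (DNS on port 53 often uses UDP).

```python
import socket
from typing import Dict, Iterable

# Ports listed in the quoted post.
DMS_PORTS = [443, 53, 9354, 445, 12000]


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host: str, ports: Iterable[int] = DMS_PORTS,
                timeout: float = 3.0) -> Dict[int, bool]:
    """Map each port to whether a TCP connection could be opened."""
    return {port: check_tcp_port(host, port, timeout) for port in ports}


# Usage (the host name is a hypothetical placeholder):
#   results = check_ports("my-managed-instance.example.net")
#   for port, is_open in results.items():
#       print(port, "open" if is_open else "blocked or closed")
```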

The technology is intriguing, though it does seem like there are still some kinks to work out.


Using the Cosmos DB Change Feed

Published 2019-07-29 by Kevin Feasel

Hasan Savran shows us how we can use Cosmos DB’s change feed to track changes to documents in a container:

This is great, but I want to do more than that. How am I going to access the changed data? What should I do if there is more than one change or insert? In my case, I need to access the SensorCode attribute so I can do something about this alert. To answer these questions, you need to know more about Azure Functions. If you can see the number of modified documents from this code, that means you are done with the Change Feed functionality. First, we need some kind of loop so the code can process multiple changes. To do that, I will use a simple foreach loop.
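
The loop described in the excerpt can be sketched in Python as follows. This is not the code from the linked post: plain dictionaries stand in for the batch of changed documents that a change feed trigger would deliver, and only the `SensorCode` attribute name comes from the excerpt.

```python
from typing import Any, Iterable, List, Mapping


def collect_sensor_codes(changed_documents: Iterable[Mapping[str, Any]]) -> List[Any]:
    """Walk a batch of changed documents and pull out each SensorCode.

    A change feed batch may hold several inserts or updates, so every
    document is visited rather than only the first one.
    """
    codes = []
    for document in changed_documents:
        code = document.get("SensorCode")
        if code is None:
            # Skip documents that do not carry the attribute.
            continue
        codes.append(code)
    return codes


# Hypothetical batch, standing in for what the trigger would hand over.
batch = [
    {"id": "1", "SensorCode": "A1"},
    {"id": "2"},
    {"id": "3", "SensorCode": "B2"},
]
print(collect_sensor_codes(batch))  # prints ['A1', 'B2']
```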

The thing which comes to mind when I hear about Cosmos DB’s change feed is Kafka, with that immutable log of actions you can read through.


Keeping S3 and Blob Storage in Sync

Published 2019-07-23 by Kevin Feasel

Sheldon Hull shares with us a technique to keep an S3 bucket in sync with an Azure Blob Storage blob:

Moving data between two cloud providers can be painful and can require more provider-specific scripting if you are making API calls. For this, you can benefit from a tool that abstracts the calls into a seamless synchronization tool.
I’ve used RClone before when needing to deduplicate several terabytes of data in my own Google Drive, so I figured I’d see if it could help me sync up 25 GB of JSON files from Azure to S3.

You’ll have to do a few of the steps on your own, but this looks like a good way of parking data in two clouds.


Adding Database to Azure SQL Elastic Pools

Published 2019-07-22 by Kevin Feasel

Arun Sirpal wants to shuffle deck chairs in Azure:

As you can see I have a standard elastic pool which has 7 databases within it. I also have CloudDB and CRMDB that are single databases but not yet part of my elastic pool. How do I move them into it?

Click through to learn how to do this.


Using AZCopy for SQL Backups

Published 2019-07-19 by Kevin Feasel

John McCormack shows how you can use AZCopy to move SQL Server backups into Azure Storage:

AZCopy is a useful command line utility for automating the copying of files and folders to Azure Storage Account containers. Specifically, I use AZCopy for SQL Backups but you can use AZCopy for copying most types of files to and from Azure.
In this blog post example (which mirrors a real world requirement I had), the situation is that whilst I need to write my SQL backups over a network share, I also want to push them up to Azure Storage (in a different region) to allow developers quicker downloads/restores. This is why I need to use AZCopy. If I only needed my backups to be written to Azure, I could have used BACKUP TO URL instead.

Read on to see how John did it.


Changes to Azure SQL Database SLA

Published 2019-07-19 by Kevin Feasel

Arun Sirpal notes a change to the Azure SQL Database Service Level Agreement:

I am sure many missed the updates to the Azure SQL Database SLA (Service Level Agreement). It used to be 99.99% across all tiers but split between two different high-availability architectural models. Basic, Standard and General Purpose tiers had their own model and the Premium / Business Critical tiers had a different one.
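
To put the 99.99% figure in concrete terms, the arithmetic below converts an availability percentage into a downtime budget. The 30-day month is an assumption made for illustration, not the SLA's own measurement period.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted by an availability SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# 30 days = 43,200 minutes; 0.01% of that is about 4.32 minutes.
print(f"{allowed_downtime_minutes(99.99):.2f} minutes")  # prints 4.32 minutes
```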

Read on to see the change.


Oracle Data Guard on Azure

Published 2019-07-17 by Kevin Feasel

Kellyn Pot’vin-Gorman’s worlds continue to collide:

So, as most people know, I’m not a big fan of Oracle RAC, (Real Application Cluster). My opinion was that it was often sold for use cases that it doesn’t serve, (such as HA) and the resource demands between the nodes, as well as what happens when a node is evicted to those that are left are not in the best interest for most use cases. On the other hand, I LOVE Oracle Data Guard, active or standard, don’t matter, the product is great and it’s an awesome option for those migrating their Oracle databases to Azure VMs.

Read on to see what Oracle Data Guard is and where you might use it.


Notebooks in Azure Databricks

Published 2019-07-16 by Kevin Feasel

Brad Llewellyn takes us through Azure Databricks notebooks:

Azure Databricks Notebooks support four programming languages: Python, Scala, SQL and R. However, selecting a language in this drop-down doesn’t limit us to only using that language. Instead, it sets the default language of the notebook. Every code block in the notebook is run independently and we can manually specify the language for each code block.

Before we get to the actual coding, we need to attach our new notebook to an existing cluster. As we said, Notebooks are nothing more than an interface for interactive code. The processing is all done on the underlying cluster.

Read on to learn how Databricks uses the notebook metaphor heavily in how you interact with it.


Logging in Azure

Published 2019-07-16 by Kevin Feasel

Rolf Tesmer has a detailed post covering how and what to log when using Azure for a modern data warehouse:

In my view – what often doesn’t get enough attention up front are the critical aspects of monitoring, auditing and availability. Thankfully, these are generally not too difficult to plug in at any point in the delivery cycle, but as with most things in the cloud there are just so many different options to consider!
So the purpose of this blog is to focus on the key areas of Azure Services Monitoring and Auditing for the Azure Modern Data Platform architecture.

Click through for examples from a number of different Azure services.

