Press "Enter" to skip to content

Category: Cloud

A Review Of AWS Managed Kafka Public Preview

Stephane Maarek is not impressed with AWS’s managed Kafka offering so far:

For me, the more people use Apache Kafka, the more business I get. As I teach Apache Kafka online on Udemy (links at https://kafka-tutorials.com/), the prospect of having an entire user base from AWS wanting to learn Apache Kafka is exciting! And as an Apache Kafka consultant, it’s always more fun to spend time deploying data pipelines than deploying infrastructure.

Unfortunately what AWS released today misses the mark. I think it’s reminiscent of managed services of open source software in AWS overall: they’re released early and lack features that I think should be MVP. In my opinion this will deter future users.

Based on Stephane’s reading, this is a product which should have sat in development for another 3-6 months to flesh out the features, upgrade the version of Kafka used, etc.  Definitely read this before jumping on AWS MSK.

Comments closed

Working With Firewall Rules From Azure SQL Database

Arun Sirpal shows us how we can use T-SQL to set and modify firewall rules within Azure SQL Database:

For this post I want to actually show you the TSQL code to do this, hopefully it will become a good reference point for the future. Before we step into the code lets understand the differences between database level and server level rules.

For server level rules they enable access your entire Azure SQL server, that is, all the databases within the same logical server. These rules are stored in the master database. Database level rules enable access to certain databases (yes you could also run this within master) within the same logical server, think of this as you being more granular with the access where they are created within the user database in question.

Personally, I try and always use database level rules, this is especially true when I work with failover groups.

Click through for instructions on how to work with both server and database level rules.

Comments closed

AWS Timestream: A Time-Series Database

Alex Woodie reports on a new entrant in the time-series database market:

Time-series databases have emerged as a best-in-class approach for storing and analyzing huge amounts of data generated by users and IoT devices. While relational and NoSQL databases are sometimes used for time-stamped and time-series data – such as clickstream data from Web and mobile devices, log data from IT gear, and data generated by industrial machinery — today’s massive data volumes from the IoT have outstripped the capability of those databases to keep up.

As the high-end time-series use cases piled up, AWS decided it was time to take action and make its entry into the still-specialized field, much as it did with last year’s launch of Neptune, a graph database, which is another specialized database field that’s emerging.

And here I was, just learning a bit about InfluxDB from Tracy Boggiano’s work.

Comments closed

Azure Price Increases In CAD

Randolph West notes that Azure services priced in Canadian dollars will increase by five percent:

Starting December 1, 2018, prices for Azure services in the Canadian dollar will increase by 5 percent to more closely align to Azure pricing in US dollars. Even after this adjustment, customers buying in the Canadian dollar will continue to find Azure offerings highly competitive.

Microsoft periodically assesses its pricing of products and services across the globe to ensure reasonable alignment across regions. This change to Azure prices is an outcome of this assessment.

The first thing that came to mind was a particular joke from the Simpsons.  For those who don’t remember, the Simpsons was a hilarious cartoon for about ten years before it was quietly killed and replaced with something almost but not quite the same, lacking most of the humor.

Comments closed

Azure SQL Database Supports R Integration

David Smith notes that Azure SQL Database now has (in preview) support for R:

Azure SQL Database, the database-as-a-service based on Microsoft SQL Server, now offers R integration. (The service is currently in preview; details on how to sign up for the preview are provided in that link.) While you’ve been able to run R in SQL Server in the cloud since the release of SQL Server 2016 by running a virtual machine, Azure SQL Database is a fully-managed instance that doesn’t require you to set up and maintain the underlying infrastructure. You just choose the size and scale of the database you want to manage, and then connect to it like any other SQL Server instance. (If you want to learn how to set up an Azure SQL database, this Microsoft Learn module is a good place to start.)

Python and Java are not yet supported, but I’d imagine that they’ll be on the way too.

Comments closed

Using Azure ML To Approve Expenses Automatically

Isabelle Van Campenhoudt walks us through a scenario of using Azure ML to find expense reports which should automatically be approved, reducing the workload for approvers:

My partner in crime Serge Luca aka Doctor Flow is the author of a nice and complex expenses approval system in Microsoft Flow .
One year ago, he asked me to add analytics to his Flow.  This year he has the interesting idea to add a machine-learning based approval in his flow and suggest me to work on it. The idea is the following: Since we have a lot of approvals in our system, can a machine learn and found some decision pattern to apply automatically to each expenses request ?
I decided to use the Microsoft Azure Machine Learning Studio. In this tool you can build experiments and use some of the most common and useful machine learning algorithms. It was amazing to see how easy it is to create and consume machine learning .

This contrasts with Ginger Grant’s nightmare scenario pretty well:  instead of trying to get the ML process to do all of the work, create a process which takes care of the really easy stuff and leave harder tasks to specialists with a deeper understanding of the rules.  That way they don’t have to spend their time on trivialities.

Comments closed

The Risk Of Shadow IT In The Cloud

Kenneth Fisher walks us through the risk of increased Shadow IT with migration to the cloud:

Shadow IT has been, well, maybe not the bane of the IT department, but certainly a pain in the neck. On the off chance you’ve never heard of shadow IT do any of these sound familiar?

  • A user asks you to restore a corrupt database on a SQL Server you’ve never heard of and isn’t in your inventory. (And 50/50 odds there’s never been a backup taken.)

  • You do a licensing true-up and dozens of new SQL Servers suddenly show up.

  • You hear from a user: “We have this mission critical Access database that suddenly isn’t working. I know you don’t support access but you’re the database person so we need you to fix it.”

It’s an interesting short essay and worth thinking about if you’re in the cloud or moving that way.

Comments closed

Working With The Databricks API Via Powershell

Gerhard Brueckl has a Powershell module for interacting with Databricks, either Azure or AWS:

As most of our deployments use PowerShell I wrote some cmdlets to easily work with the Databricks API in my scripts. These included managing clusters (create, start, stop, …), deploying content/notebooks, adding secrets, executing jobs/notebooks, etc. After some time I ended up having 20+ single scripts which was not really maintainable any more. So I packed them into a PowerShell module and also published it to the PowerShell Gallery (https://www.powershellgallery.com/packages/DatabricksPS) for everyone to use!

This looks like a pretty good module if you work with Databricks.

Comments closed

Migrating A Database To Managed Instances

Frank Gill shows how to migrate a database from on-premises to an Azure SQL Managed Instance:

If you have run through my last Managed Instance blog post, you have a Managed Instance at your disposal.  The PowerShell script for creating the network requirements also contains steps to create an Azure VM in a different subnet in the same VNet.  Unless you have a site-to-site VPN or Express Route between your on-prem environment and Azure, you will use this VM to connect to your Managed Instance.

Install Management Studio on the Azure VM.  To connect to your Managed Instance, you will need the host name for your Managed Instance.  You can find the Managed Instance host name on the resource page for your Managed Instance in the Portal.

I think this migration story is a bit easier for DBAs than the old Azure SQL Database strategy of building dacpacs.

Comments closed

Priority Queuing In Azure SQL Data Warehouse

Matt How walks us through an improvement to Azure SQL Data Warehouse:

The concept of workload management is a key factor for Azure SQL DW as there is only limited concurrency slots available and depending on the resource class, these slots can fill up pretty quickly. Once the concurrency slots are full, queries are queued until a sufficiently sized slot is opened up. Let’s recap what Resource Classes are and how they affect workload management.

A Resource Class is a pre-configured database role that determines how much resource is allocated to queries coming from users that belong to that role. For example, an ETL service account may use a “large” resource class and be allocated a generous amount of the server, however an analyst may use a “small” resource class and therefore only use up a small amount of the server with their queries. There are actually 2 types of resource class, Dynamic and Static. The Dynamic resource classes will grant a set percentage of memory to a query and actual value of this percentage will vary as the Warehouse scales up and down. The key factor is that an xLargeRc (extra-large resource class) will always take up 70% of the Server and will not allow any other queries to be run concurrently. No matter how much you scale up the Warehouse, queries run with an xLargeRc will run one at a time. Conversely, queries run with a smallrc will only be allocated 4% of the Server and therefore as a Warehouse scales up, this 4% becomes a larger amount of resource and can therefore process data quicker.

This looks like a useful addition.  Click through for a few examples of how it will work.

Comments closed