Category: Cloud

Hyperscale and RBIO_RG_STORAGE

Reitse Eskens runs into a strange bug in Azure SQL Database Hyperscale:

This single wait made sure our complete environment went dead in the water. Everything halted. To get some context, Microsoft has some documentation on this wait:

Occurs when a Hyperscale database primary compute node log generation rate is being throttled due to delayed log consumption at the page server(s).

Well, that’s not really helping, because that’s about everything they tell you about it.

Click through for Reitse’s findings and Microsoft’s advice.

Secure Cluster Connectivity in Azure Databricks

Abhinav Garg and Premal Shah have an announcement:

We’re excited to announce the general availability of Secure Cluster Connectivity (also commonly known as No Public IP) on Azure Databricks. This release applies to Microsoft Azure Public Cloud and Azure Government regions, in both Standard and Premium pricing tiers. Hundreds of our global customers including large financial services, healthcare and retail organizations have already adopted the capability to enable secure and reliable deployments of the Azure Databricks unified data platform. It allows them to securely process company and customer data in private Azure Virtual Networks, thus satisfying a major requirement of their enterprise governance policies.

Read on for more detail about how this works.

Connecting Confluent and Databricks on Azure

Angela Chu, et al., take us through a streaming data ingestion process:

How do you process IoT data, change data capture (CDC) data, or streaming data from sensors, applications, and sources in real time? Apache Kafka® and Azure Databricks are widely adopted technologies in the industry, but they require specific skills and expertise to run. Leveraging Confluent Cloud and Azure Databricks as fully managed services in Microsoft Azure, you can implement new real-time data pipelines with less effort and without the need to upgrade your datacenter (or set up a new one).

This blog post demonstrates how to configure Azure Databricks to interact with Confluent Cloud so that you can ingest, process, store, make real-time predictions and gain business insights from your data.

Click through for a detailed demonstration.
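
For a rough idea of what the Databricks side of that configuration might involve, here is a minimal Python sketch. It is not taken from the linked post. It assumes a Confluent Cloud cluster that authenticates with an API key and secret over SASL_SSL/PLAIN, and it builds the option map you would hand to Spark's Kafka source. The function name, the bootstrap host, the topic, and the credentials are all hypothetical placeholders, and the `kafkashaded.` prefix assumes the shaded Kafka client that ships with the Databricks runtime.

```python
def confluent_kafka_options(bootstrap_servers, api_key, api_secret, topic,
                            starting_offsets="earliest"):
    """Build options for Spark's Kafka source (hypothetical helper).

    Assumes SASL_SSL with the PLAIN mechanism, where the API key is the
    username and the API secret is the password.
    """
    # On Databricks the Kafka client classes are shaded, hence the
    # "kafkashaded." prefix (an assumption; drop it on plain Apache Spark).
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="{api_key}" password="{api_secret}";'
    )
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only; in practice the key and
    # secret would come from a secret store, not from source code.
    opts = confluent_kafka_options(
        "pkc-example.westus2.azure.confluent.cloud:9092",
        "MY_API_KEY",
        "MY_API_SECRET",
        "clickstream",
    )
    # Print only the option names so the secret never reaches the output.
    print(sorted(opts))
```

In a notebook, something like `spark.readStream.format("kafka").options(**opts).load()` would then open the stream, with `spark` being the session Databricks provides.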

Long-Term Backups on Azure SQL Database

Arun Sirpal takes us through a fairly new feature in Azure SQL Database:

There is a new (ish) interface for viewing and configuring backups for your Azure SQL Database. This can be found within the settings section of the SQL Server.

As you can see, by default we have 7 days retention to allow for PITR (Point In Time Recovery); for anything longer, you will need to set up long-term retention.

Click through to see how to set this up.

Optimizing a SQL Server 2019 Project for a Dedicated SQL Pool

Kevin Chant shows us how we can modify a database schema intended for SQL Server 2019 to work best with an Azure Synapse Analytics dedicated SQL pool:

In this post I want to cover how you can transform your SQL Server database schema for a dedicated SQL Pool if you are using Azure DevOps. Because I covered it at Data Toboggan over the weekend and it can be very useful.

By the end of this post, you will know one way you can transform the schema of a database project for SQL Server 2019 if you are using Azure DevOps. So that you can make it optimal for dedicated SQL Pools.

Click through for the process and an example. Note that this isn’t a quick “check this box and you’re done” type of solution, but if you already have a proper star schema, this will help you think through some of the things you’ll need to do.

A Review of AWS Athena

John McCormack updates an older review:

AWS’s own documentation is the best place for full details on the Athena offering, this post hopes to serve as further explanation and also act as an anchor to some more detailed information. As it is a managed service, Athena requires no administration, maintenance or patching. It’s not designed for regular querying of tables in a way that you would with an RDBMS. Performance is geared around querying large data sets which may include structured data or semi-structured data. There are no licensing costs like you may have with some Relational Database Management Systems (RDBMS) such as SQL Server and costs are kept low, as you only pay when you run queries in AWS Athena.

Click through for an overview of product benefits.

Using the Cloudshell Code Editor

Hamish Watson shows off the cloudshell code editor in Azure:

This post is about code which is an online editor you can use with cloudshell.

I live in the Azure platform all day (almost) every day. What this means is I need to be able to run scripts from just about any device – there are times where I will be scaling a Hyperscale database at 10:30pm from my phone….

So I use cloudshell a LOT in Azure

Click through for more details.

AzureCosmosR

Hong Ooi takes us through an R library for working with Cosmos DB:

Among other features, Azure Cosmos DB is notable in that it supports multiple data models and APIs. When you create a new Cosmos DB account, you specify which API you want to use: SQL/core API, which lets you use a dialect of T-SQL to query and manage tables and documents; MongoDB; Azure table storage; Cassandra; or Gremlin (graph). AzureCosmosR provides a comprehensive interface to the SQL API, as well as bridges to the MongoDB and table storage APIs. On the Resource Manager side, AzureCosmosR extends the AzureRMR class framework to allow creating and managing Cosmos DB accounts.

AzureCosmosR is now available on CRAN. You can also install the development version from GitHub, with devtools::install_github("Azure/AzureCosmosR").

Hong provides examples for us using three of the Cosmos DB APIs, so check it out.

Logical Separation in Azure Data Factory

Rayis Imayev is at a crossroads:

I was raised listening and reading fairy tales where the main character would reach a crossroad with a large stone that had some directions written on it – turn right and you will lose your horse, turn left and you will lose your life, walk straight and you will find your happiness. 

Also, growing up in a small Ukrainian industrial city, closely situated to a railroad hub, I was always fascinated to see many colorful rail traffic lights, trying to imagine where a myriad of rail tracks would lead trains on them.

Similarly, Azure Data Factory (ADF) provides several ways, to control/direct/filter your pipeline workflows; it’s all conditioned and constrained to the boundaries of my “crossroad stone” writings.

As one of my intellectual heroes is purported to have said, if you see a fork in the road, take it.

Deciding to Use Dataflows or Power Query

Paul Turley explains a choice:

When should you use dataflows vs regular Power Query? I didn’t jump on the dataflows bandwagon and struggled to adopt them at first. Frankly, Power Query is easier to use. The browser-based dataflows designer is quite impressive but it is not as responsive and convenient as the desktop app, so this is a bit of a trade-off. The power and value of dataflows become apparent when the business reaches a certain stage of data culture maturity.

Before we can address the question of whether to use Power BI Dataflows, conventional Power BI queries, or any other approach to source and transform data; we need to briefly review different options for orchestrating a Business Intelligence solution in the Microsoft cloud ecosystem.

Read on for a bit of architecture and the explanation.
