Press "Enter" to skip to content

Category: Cloud

Postgres Change Data Capture into Kafka

Abhishek Gupta walks us through an example of change data capture to track events:

Change Data Capture (CDC) is a technique used to track row-level changes in database tables in response to create, update, and delete operations. Different databases use different techniques to expose these change data events – for example, logical decoding in PostgreSQL, the MySQL binary log (binlog), etc. This is a powerful capability, but it is useful only if there is a way to tap into these event logs and make them available to other services which depend on that information.

Debezium does just that! It is a distributed platform that builds on top of Change Data Capture features available in different databases. It provides a set of Kafka Connect connectors which tap into row-level changes (using CDC) in database table(s) and convert them into event streams. These event streams are sent to Apache Kafka which is a scalable event streaming platform – a perfect fit! Once the change log events are in Kafka, they will be available to all the downstream applications.

Click through for the demo, using Azure components.
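To give a flavor of the wiring involved, here is a minimal, hypothetical sketch of registering a Debezium PostgreSQL connector against the Kafka Connect REST API from Python. The hostnames, credentials, and table names are placeholders, and the exact connector properties can vary between Debezium versions:

```python
import json
import requests

# Hypothetical connector definition -- adjust hosts, credentials, and tables
# for your own environment.
connector = {
    "name": "inventory-postgres-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",                   # PostgreSQL logical decoding plugin
        "database.hostname": "my-postgres-host",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "cdc_password",
        "database.dbname": "inventory",
        "database.server.name": "inventory-server",  # used as the topic prefix
        "table.include.list": "public.orders",
    },
}

# Register the connector with the Kafka Connect REST API.
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
print(resp.json())
```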

Storing SQL Server Database Files in Blob Storage

Tomaz Kastrun has a wacky idea:

Storing SQL Server database files in Azure blob storage is a great solution for databases that are often migrated between instances, servers, or virtual machines, or that would otherwise be divided between instances. This scenario has another positive aspect as well: the ability to create snapshot backups to Azure is seamless.

Following the steps below, we will create an Azure Blob storage account where the MSSQL Server database files will reside, with MSSQL Server running on-prem. Assuming you already have an Azure account (if not, you can get a free Azure account), let’s proceed by opening Windows Terminal in PowerShell mode.

I’m impressed that it worked and could see it being an option for small demo databases, but I can’t imagine performance would be good enough for a production scenario.
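Tomaz sets up the storage side in PowerShell; on the SQL Server side, this pattern generally boils down to a SAS-backed credential plus a CREATE DATABASE whose files point at blob URLs. A hedged sketch run through pyodbc, with placeholder storage account, container, and token values:

```python
import pyodbc

# Placeholder values -- substitute your own storage account, container, and SAS token.
container_url = "https://mystorageaccount.blob.core.windows.net/sqldata"
sas_token = "sv=...&sig=..."  # SAS token without the leading '?'

create_credential = f"""
CREATE CREDENTIAL [{container_url}]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '{sas_token}';
"""

create_database = f"""
CREATE DATABASE BlobDemo
ON (NAME = BlobDemo_data, FILENAME = '{container_url}/BlobDemo_data.mdf')
LOG ON (NAME = BlobDemo_log, FILENAME = '{container_url}/BlobDemo_log.ldf');
"""

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;Trusted_Connection=yes;",
    autocommit=True,  # CREATE DATABASE cannot run inside a user transaction
)
conn.execute(create_credential)
conn.execute(create_database)
```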

Loading Data from S3 into Power BI

Gilbert Quevauvilliers loves a challenge:

I really enjoy a good challenge, and with my customer they have all their data stored in AWS S3. Whilst there is no native connector, I thought there must be a way for me to get the data from AWS S3 into Power BI.

I did a bit of Googling and could not find any suitable solution. I also found and learned that I could use AWS Athena to query the data living in S3. (I am definitely not an expert, nor do I have a lot of knowledge, in the AWS space. I am fortunate that I have other people who know AWS and were able to set up, configure, and give me the details to connect to S3 via AWS Athena.)

Below are the steps on how I got this working.

Why they don’t have a proper connector is a bit of a head-scratcher to me given the sheer amount of data stored in S3 and the sheer number of connectors in Power BI.
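Gilbert’s route goes through AWS Athena as the query layer over S3. Purely as an illustration of that layer (not from the post), here’s a hedged boto3 sketch – the database, table, and results-bucket names are placeholders:

```python
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Kick off a query against data sitting in S3 (names are hypothetical).
start = athena.start_query_execution(
    QueryString="SELECT * FROM sales_db.orders LIMIT 10",
    QueryExecutionContext={"Database": "sales_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/"},
)
query_id = start["QueryExecutionId"]

# Poll until the query finishes.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```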

Transforming JSON to CSV: ADF vs Databricks

Rayis Imayev compares two methods of transforming a JSON-structured data set into a CSV:

There is a well-known and broadly advertised message from Microsoft that Azure Data Factory (ADF) is a code-free environment to help you create your data integration solutions – https://azure.microsoft.com/en-us/resources/videos/microsoft-azure-data-factory-code-free-cloud-data-integration-at-scale/. I agree with and support this approach of using a drag-and-drop visual UI to build and automate data pipelines without writing code. However, I’m also interested to see if I can recreate certain ADF operations by writing code, just out of curiosity.

Rayis includes a link to the Azure Data Factory step-by-step demonstration and then kicks it up a notch with Databricks. Read on to see how the two compare.
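On the Databricks side, the heart of the comparison is reading nested JSON and writing it back out flat as CSV. A minimal PySpark sketch – the paths and column names are placeholders, not Rayis’s actual dataset:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# In a Databricks notebook `spark` already exists; this line is only for running elsewhere.
spark = SparkSession.builder.getOrCreate()

# Read a (possibly multi-line) JSON document into a DataFrame.
df = spark.read.option("multiLine", True).json("/mnt/source/pipeline_run.json")

# Flatten the nested attributes of interest into plain columns.
flat = df.select(
    col("runId"),
    col("pipelineName"),
    col("parameters.sourcePath").alias("sourcePath"),
)

# Write a single CSV file with a header row.
flat.coalesce(1).write.mode("overwrite").option("header", True).csv("/mnt/target/pipeline_run_csv")
```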

Calculating Cloud App Availability

Dave Bermingham gives you a way to calculate how available you should expect your application to be given SLAs:

When deploying business critical applications in the cloud you want to make sure they are highly available. The good news is that if you plan properly, you can achieve 99.99% (4-nines) of availability or more. However, calculating your true availability may not be as straightforward as it seems.

When considering availability, you must consider the key components that make access to your application possible – what I’ll call the availability chain. The components of the availability chain are:

– Compute
– Network 
– Storage
– Application
– Dependent services

Your application is only as available as your weakest link, and your downtime increases exponentially with each additional link you add to the chain.  Let’s examine each of the links. 

Read on for a breakdown of these items.
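If you want to play with the arithmetic behind the availability chain, the composite availability of serially dependent components is the product of their individual SLAs. A quick Python sketch with made-up numbers:

```python
# Hypothetical per-component SLAs for the links in the availability chain.
component_slas = {
    "compute": 0.9999,
    "network": 0.9999,
    "storage": 0.9999,
    "application": 0.999,
    "dependent services": 0.999,
}

# Composite availability is the product of the individual availabilities --
# every extra link lowers the total.
composite = 1.0
for sla in component_slas.values():
    composite *= sla

downtime_minutes_per_year = (1 - composite) * 365 * 24 * 60
print(f"Composite availability: {composite:.5%}")
print(f"Expected downtime: about {downtime_minutes_per_year:.0f} minutes per year")
```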

Understanding Digital Twins in IoT Hub

Paul Hernandez explains the concept of digital twins in the IoT space:

Azure Digital Twins Service offers a way to build next-generation IoT solutions. There are other approaches on the market for describing IoT devices and building digital twins. Without making a formal comparison, I can say that with Azure Digital Twins it is possible to build a powerful semantic layer on top of your connected devices using domain-specific models.

To show you how this works, let’s create a kind of “hello world” example. An end-to-end solution is out of scope for this post. Instead, I will create a hands-on tutorial to demonstrate some of the functionality.

Click through to see an example.
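As a rough idea of what a “hello world” looks like here (not Paul’s actual example), a digital twin pairs a DTDL model with a twin instance of it. A hedged Python sketch assuming the azure-digitaltwins-core and azure-identity packages – the instance URL, model ID, and twin name are placeholders, and method names may differ slightly between SDK versions:

```python
from azure.identity import DefaultAzureCredential
from azure.digitaltwins.core import DigitalTwinsClient

# Hypothetical Azure Digital Twins instance endpoint.
client = DigitalTwinsClient(
    "https://my-adt-instance.api.weu.digitaltwins.azure.net",
    DefaultAzureCredential(),
)

# A minimal DTDL interface describing a thermostat device.
thermostat_model = {
    "@id": "dtmi:example:Thermostat;1",
    "@type": "Interface",
    "@context": "dtmi:dtdl:context;2",
    "displayName": "Thermostat",
    "contents": [
        {"@type": "Property", "name": "temperature", "schema": "double"}
    ],
}
client.create_models([thermostat_model])

# Create a twin instance of that model and set its property.
twin = {
    "$metadata": {"$model": "dtmi:example:Thermostat;1"},
    "temperature": 21.5,
}
client.upsert_digital_twin("thermostat-01", twin)
```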

Assuming a Role with AWS Powershell Tools

Sheldon Hull solves a problem:

I’ve had some issues in the past working with AWS.Tools PowerShell SDK and correctly assuming credentials.

By default, most of the time it was easier to use a dedicated IAM credential setup for the purpose.

However, as I’ve wanted to run some scripts across multiple accounts, the need to simplify by assuming a role has been more important.

It’s also a better practice than having to manage multiple key rotations in all accounts.

Read on to see how far Sheldon has been able to take this, but also how much more work is left to do.
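Sheldon works with the AWS.Tools PowerShell modules; purely for comparison, the same assume-role flow in Python with boto3 looks roughly like this (the role ARN is a placeholder):

```python
import boto3

# Hypothetical role in the target account that trusts your principal.
ROLE_ARN = "arn:aws:iam::123456789012:role/cross-account-admin"

sts = boto3.client("sts")
assumed = sts.assume_role(RoleArn=ROLE_ARN, RoleSessionName="scripted-session")
creds = assumed["Credentials"]

# Use the temporary credentials for calls into the other account.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```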

Disk Caching with SQL Server VM Disks in Azure

Niko Neugebauer performs some tests:

Microsoft has been extremely clear in its best practice recommendations for SQL Server workloads on Azure VMs:
– use read caching for the data drives/storage pools
– use no caching for the log drives/storage pools
– use read caching for the tempdb drives/storage pools

Sounds simple and direct, doesn’t it?
Let me borrow your attention for the next couple of minutes to point out some situations where you might want to reconsider the best practices.

But do read on for some important notes.
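If your testing leads you to change caching settings, one way to do it programmatically is through the Azure compute SDK. A hedged Python sketch with azure-mgmt-compute – the subscription, resource group, VM, and disk names are all placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import CachingTypes

# Hypothetical names -- substitute your own subscription, resource group, VM, and disks.
compute = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")
vm = compute.virtual_machines.get("sql-rg", "sql-vm-01")

# Set read caching on the data disk and no caching on the log disk.
for disk in vm.storage_profile.data_disks:
    if disk.name == "sql-data-disk":
        disk.caching = CachingTypes.READ_ONLY
    elif disk.name == "sql-log-disk":
        disk.caching = CachingTypes.NONE

# Push the updated storage profile back to the VM.
compute.virtual_machines.begin_create_or_update("sql-rg", "sql-vm-01", vm).result()
```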

Auto-Shutdown an Azure VM and Notify You on Slack

Daniel Hutmacher has a fun assignment:

Virtual machines cost money when they’re powered on. Most servers obviously need to be on 24 hours a day. Others, like development machines, only have to be on when you’re using them. And if you forget to turn them off, they’ll empty out your Azure credits (or your credit card) before you know it.

Today, I’ll show you how to set an Auto-shutdown time to turn a VM off if you forget, as well as have Azure notify you on Slack 30 minutes ahead of time, so you have the option to postpone or cancel the shutdown.

There are a few steps to the process, but everything is straightforward.
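The Slack half of this comes down to posting JSON to an incoming webhook. A minimal, hypothetical Python sketch – the webhook and skip URLs are placeholders, and see Daniel’s post for how he triggers the notification from Azure itself:

```python
import requests

# Placeholder incoming-webhook URL -- create one under your Slack workspace's apps.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"


def notify_pending_shutdown(vm_name: str, minutes_left: int, skip_url: str) -> None:
    """Post a pre-shutdown warning to Slack with a link to postpone or skip it."""
    message = {
        "text": (
            f":warning: VM *{vm_name}* shuts down in {minutes_left} minutes. "
            f"<{skip_url}|Skip or postpone the shutdown>"
        )
    }
    resp = requests.post(SLACK_WEBHOOK_URL, json=message, timeout=10)
    resp.raise_for_status()


notify_pending_shutdown("dev-vm-01", 30, "https://example.com/skip")
```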

Azure SQL Database Business Continuity Options

James Serra covers business continuity scenarios with Azure SQL Database:

I have written a number of blog posts on the topic of business continuity in SQL Database before (HA/DR for Azure SQL Database, Azure SQL Database high availability, Azure SQL Database disaster recovery), but with a number of new features I felt it was time for a new blog on the subject, focusing on disaster recovery rather than high availability.

Business continuity in Azure SQL Database and SQL Managed Instance refers to the mechanisms, policies, and procedures that enable your business to continue operating in the face of disruption, particularly to its computing infrastructure. In most cases, SQL Database and SQL Managed Instance will handle the disruptive events that might happen in the cloud environment and keep your applications and business processes running.

James takes us through options available for Azure SQL Database as well as managed instances.
