Press "Enter" to skip to content

Category: Cloud

Cloning Tables in Databricks

Chen Hirsh hogs the photocopier:

The simplest use case to explain why table cloning is helpful is this: Let’s say you have a large table, and you want to test some new process on it, but you don’t want to ruin the data for other processes, so you need a clean copy of your table (or multiple tables) to play with. Coping a large table might take time (Databricks does it very fast, but if it’s a big table it still takes time to copy the data) ,and what happens if you then need to change your code? you have to drop the target table, copy the source table again, and so on.

here is where cloning can be your friend.

Read on to learn about three cloning techniques. H/T Madeira Data Solutions blog.

Comments closed

Building a GitHub Codespace Configuration for Polyglot Notebooks

Matt Eland makes some recommendations:

In order to get Polyglot Notebooks to work with GitHub Codespaces, you’ll need to match the current requirements of the Polyglot Notebooks extension and its underlying .NET Interactive kernels.

This relies on two files in your .devcontainer directory:

  • Dockerfile which describes the Docker container the Codespace will run in
  • devcontainer.json which describes how the dev container is configured in terms of extensions and ports

Read on to learn more. Also, Matt has a brand new book available on the topic of polyglot notebooks, so check that out.

Comments closed

Azure SQL DB Hyperscale Elastic Pools now GA

Arvind Shyamsundar has an announcement:

Azure SQL Database is the preferred database technology for hundreds of thousands of customers. Built on top of the rock-solid SQL Server engine and leveraging leading cloud-native architecture and technologies, Azure SQL Database Hyperscale offers leading performance, scalability and elasticity with one of the lowest TCO in the industry .

While you may start with a standalone Hyperscale database, chances are that as your fleet of databases grows, you want to optimize price and performance across a set of Hyperscale databases. Elastic pools offer the convenience of pooling resources like CPU, memory, IO, while ensuring strong security isolation between those databases.

Read on to learn more about what it offers and what it costs.

Comments closed

Microsoft Fabric Capacities and Reserved Instances

Marc Lelijveld shares an experience:

Last week, I had a situation in which a client wanted to purchase a reserved instance Fabric capacity. Me being me, I assumed it would be super straight forward to purchase through Azure. However, at some point I was lost in the process where the official documentation confused even more. In the end, I figured out and managed to get a capacity running based on the Reserved Instance pricing. I didn’t find any other blogs or articles describing this confusion or specific case. Therefore, I decided to write down my thoughts and findings in a blog.

This blog is not only relevant if you work with Microsoft Fabric, but also for anyone currently working with Power BI Premium. Given the deprecation of Power BI Premium capacities, you have to switch to Fabric capacities sooner or later.

Read on to learn more about the differences between pay-as-you-go and reserved instance capacities, the process to make a reservation, and what comes after that before you have a Microsoft Fabric capacity ready to go.

Comments closed

Real-Time Streaming in Azure

Temidayo Omoniyi takes us through an architecture:

In today’s world, billions of data are generated daily from messaging applications like WhatsApp, financial data like the New York Stock Exchange, or video streaming platforms like YouTube. As a data engineer or solution architect, you are tasked to design a real-time streaming platform that captures the data as they are generated and stored in the necessary storage for decision-making.

This does a great job of going into detail, not only at the architectural level, but also setup and practical implementation.

Comments closed

Exploring Azure Data Storage Options

Anurag K talks about data storage:

Choosing the right data storage option in Azure is critical for ensuring your applications run efficiently, securely, and cost-effectively. In this blog post, we’ll dive deeper into three key Azure storage services: Blob Storage, File Storage, and Disk Storage. We’ll explore their features, use cases, and provide specific examples to help you understand how to best leverage these services in your cloud environment.

I would note here that within the Blob storage section, you could carve out an explanation of Data Lake Storage Gen2, which starts from Blob storage and then adds hierarchical namespaces (i.e., folders).

Comments closed

Running a Microsoft Fabric Notebook via Azure DevOps

Kevin Chant runs a notebook:

In this post I want to share one way that you can run a Microsoft Fabric notebook from Azure DevOps.

You can consider this post a follow-up to my last post about unit tests on Microsoft Fabric items. Since somebody asked me about automating notebooks and I wanted to show it in action.

Please note, currently the ability to call the API that runs a notebook on demand does not support service principals.

Despite that limitation, Kevin shows two ways to authenticate while calling the appropriate API.

Comments closed

T-SQL Snapshot Point-in-Time Recovery to Azure VM

Anthony Nocentino continues a series on T-SQL snapshot backups:

In this post, the third in our series on using T-SQL Snapshot Backup, I will guide you through using the new T-SQL Snapshot Backup feature in SQL Server 2022 to take a snapshot backup and then perform point-in-time database restores using that snapshot backup as the base, but this time using an Azure Virtual Machine. We will explore how to manage Azure storage-level operations, such as taking snapshots, cloning snapshots, and executing an instantaneous point-in-time database restore from the snapshot with minimal impact on your infrastructure. Additionally, I will demonstrate a PowerShell script that utilizes dbatools and Azure Az modules to automate the process.

Read on for the script and plenty of details.

Comments closed