Press "Enter" to skip to content

Category: Cloud

Comparing Cassandra and DynamoDB

Lewis DiFelice compares and contrasts Cassandra with DynamoDB:

In this post, we’ll look at some of the key differences between Apache Cassandra (hereafter just Cassandra) and DynamoDB.

Both are distributed databases and have similar architecture, and both offer incredible scalability, reliability, and resilience. However, there are also differences,  and understanding the differences and cost benefits can help you determine the right solution for your application.

There’s some good info in this comparison.

Comments closed

Changing Azure SQL DB Service-Level Objectives

Monica Rathbun notes that SSMS lets you change service-level objectives for Azure SQL Databases:

Sometimes as a DBA, I am lazy and want the ability to execute all of my tasks in one place. Lucky for me I discovered the other day that I can change my Azure SQL Database Service Level Object options within SQL Server Management Studio (SSMS) without ever having to go to the Azure Portal.

Read on to learn how, as well as what you can change.

Comments closed

Working with Read-Only Endpoints in Azure SQL Database

Arun Sirpal takes us through one method for improving performance in Azure SQL Database:

One of the main benefits of configuring active geo-replication for Azure SQL Database is leveraging the read-only endpoint, a good technique to split away read only activity from OLTP based workloads. This means that there is no reason why you cannot point users to these databases via tools such as Power BI as highlighted below.

But there are some things to keep in mind, as Arun points out.

Comments closed

The %tensorboard Magic Command in Databricks Notebooks

Jerry Liang and Hossein Falaki introduce a new magic command in Databricks Runtime 7.2:

In 2017, we released the  dbutils.tensorboard.start() API to manage and use TensorBoard inside Databricks python notebooks. This API only permits one active TensorBoard process on a cluster at any given time – which hinders multi-tenant use-cases. Early last year, TensorBoard released its own API for notebooks via the %tensorboard python magic command. This API not only starts TensorBoard processes but also exposes the TensorBoard’s command line arguments in the notebook environment. In addition, it embeds the TensorBoard UI inside notebooks, whereas the dbutils.tensorboard.start API prints a link to open TensorBoard in a new tab.

Read on to see how you can use it.

Comments closed

Managing Azure IoT Digital Twins with Python

Paul Hernandez continues a series on digital twins:

As mentioned in my previous post, Azure Digital Twins (ADT) Service is a new way to build next generation IoT solutions. In the first post I show you in a video how to manage ADT instances with the ADT Explorer. In the second post I show how to do mostly the same but using Postman and the ADT Rest API.

ADT has control plane APIs and data plane APIs. The latest is used to manage elements in the ADT instance. In order to use these APIs Microsoft published a .Net (C#) SDK. And SDK is a convenient way to manage instances, since you can easily create applications for your digital twins. If for any reason you prefer to use another language like java, javascript or python, you need to generate your own SDK.

In this post I describe how to autogenerate a Python SDK using the tool Autorest and a swagger file.

Read on to see how you can manage digital twins by generating an SDK.

Comments closed

A Review of Azure Synapse Analytics

Teo Lachev looks at Azure Synapse Analytics:

There is plenty to like in Azure Synapse which is the evaluation of Azure SQL DW. If you’re tasked to implement a cloud-based data warehouse, you have a choice among three Azure SQL Server-based PaaS offerings, including Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse. In a nutshell, Azure SQL Database and Azure SQL MI are optimized for OLTP workloads. For example, they have full logging enabled and replicate each transaction across replicas. Full logging is usually a no-no for decent size DW workloads because of the massive ETL changes involved.

In addition, to achieve good performance, you’ll find yourself moving up the performance tiers and toward the price point of the lower Azure Synapse SKUs. Not to mention that unlike Azure SQL Database, Azure Synapse can be paused, such as when reports hit a semantic layer instead of DW, and this may offer additional cost cutting options.

Teo focuses primarily on SQL Pools and the more SQL-friendly side of things (ELT and Power BI) rather than Spark pools.

Comments closed

ADF Data Flows and Joins Failing During Debugging

Mark Kromer clears up some issues around debugging in Azure Data Factory:

One of the important features built into ADF is the ability to quickly preview your data while designing your data flows and to execute the finished product against a sampling of data prior to finalizing and operationalizing your pipelines.

However, there are a few fundamentals relative to working with Joins that you should keep in mind and a few details below are important to understand at design time and while debugging / testing.

The answer makes sense but it would not have been the first thing to come to mind for me.

Comments closed

Statistics Management with Azure SQL DB Serverless

Joey D’Antony takes us through stats management with the serverless tier of Azure SQL Database:

One of the only things platform as a service databases like Azure SQL Database do not do for you is actively manage column and index statistics. While backups, patches, and even integrity checks are built into the platform services, managing your metadata is not. Since Azure SQL Database lacks a SQL Sever Agent for scheduling, you have to use an alternative for job scheduling. 

Read on to learn about techniques as well as a few gotchas.

Comments closed

Securing Application Secrets with Azure Key Vault

Rishit Mishra walks us through Azure Key Vault:

As the name suggests, Azure Key Vault is used to store and manage keys securely. Key Vault can be used to store the cryptographic secrets and keys such as authentication keys, storage account keys, data encryption keys, passwords and certificates.

Azure Key Vault enables developers to create the keys for development and testing in minutes, and they can further migrate this setup seamlessly onto the production environment.

The centralized key store/vault can be securely managed by the Key Vault owner who manages permissions to this key store and would be responsible for keeping the secrets secure.

Key Vault becomes quite useful in managing secrets in tools like Azure Databricks and Azure Data Factory without saving a bunch of keys in configuration files. And it’s a lot safer than that option, too.

Comments closed

Dropping Unused Indexes in Azure SQL DB

Monica Rathbun gives an important lesson around tracking index utilization in Azure SQL Database:

If the index has not shown any utilization I investigate to determine if it is one that can be removed. However, this week something caught my attention. I was looking at a client’s indexes and noted the values for these were not as high as I would have expected. I know that these index statistics are reset upon every SQL Server Service restart, but in this case, I was working on an Azure SQL Database. which got me wondering exactly how that worked. With an Azure Virtual Machine or an on Prem SQL Server instance this is easy to figure out. But with an Azure SQL Database we do not have control over when restarts are done, and what about the Serverless offering (which pauses unutilized databases to reduce costs), how do those behave?  I really want to make sure before I remove any indexes from a database that I am examining the best data possible to make that decision. So, I did some digging.

Read on to see what Monica discovered.

Comments closed