
Category: Cloud

Apache Ranger on ElasticMapReduce

Laurence Geng looks at Ranger:

Whether you’ve successfully made it before or not, installing and integrating Windows AD/OpenLDAP + Ranger + EMR is a very hard job. It is complicated, error-prone, and time-consuming for the following reasons:

Read on for the list of reasons, some background on Ranger, and an automated installer intended to make life a bit easier.

Comments closed

Controlling Cosmos DB Time to Live

Rahul Mehta pulls out the stopwatch:

As Microsoft states, Azure Cosmos DB “is a fully managed NoSQL database service for building scalable, high-performance applications”. Cosmos DB is widely used for storing NoSQL data, with options to create accounts using different APIs: Core (SQL), MongoDB, Cassandra, Table, and Gremlin.

With wide usage, the content storage also increases, sometimes even by gigabytes a day. With such content storage, retention and archival of data are among the most common asks from customers. Today, we are going to talk about how to retain data and remove unnecessary data periodically from Azure Cosmos DB. Before we do that, we need to understand a storage concept called “Container”.

Read on to learn about containers, as well as the built-in way to garbage collect data.
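
As a rough sketch of how the time-to-live rules interact, assuming the documented Cosmos DB behavior: a container-level `defaultTtl` and an optional per-item `ttl`, both in seconds, with -1 meaning "no expiry" and the clock starting at the item's last write. The function names here are hypothetical and this is a local model of the rules, not an SDK call:

```python
from typing import Optional


def effective_ttl(container_default_ttl: Optional[int],
                  item_ttl: Optional[int] = None) -> Optional[int]:
    """Return how many seconds an item lives after its last write.

    None means the item never expires.
    """
    if container_default_ttl is None:
        # TTL is not enabled on the container, so item-level ttl is ignored.
        return None
    # An item-level ttl, when present, overrides the container default.
    ttl = item_ttl if item_ttl is not None else container_default_ttl
    # -1 means "do not expire" at either level.
    return None if ttl == -1 else ttl


def is_expired(last_modified_ts: int, now_ts: int,
               container_default_ttl: Optional[int],
               item_ttl: Optional[int] = None) -> bool:
    """Check expiry against the item's last-modified timestamp (epoch seconds)."""
    ttl = effective_ttl(container_default_ttl, item_ttl)
    return ttl is not None and now_ts - last_modified_ts >= ttl
```

Under this model, a container with `defaultTtl` of -1 keeps everything unless an individual item opts in with its own `ttl`, which is a common setup when only some documents should age out.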


Migrating Azure Analysis Services to Power BI Premium

Gilbert Quevauvilliers dumps AAS:

I thought it would be a good idea to walk through the steps when looking to migrate AAS to PBI.

In the past when I had to do this for clients it was a lot of manual steps and a lot of small things to get just right. This process is now seamless and awesome!

Reviewing Gilbert’s step-by-step process, yeah, this is easy, though watch out for the pitfalls Gilbert found.


Redshift Query Editor v2

Anusha Challa, et al., announce a new version of the Redshift query editor:

Amazon Redshift is a fast, fully managed, petabyte-scale cloud data warehouse. You have the flexibility to choose from provisioned and serverless compute modes. You can start loading and querying large datasets conveniently in Amazon Redshift using Amazon Redshift Query Editor v2, a web-based SQL client application.

It’s worth a try if you’re a Redshift user, though I’d imagine that frequent Redshift users have already sorted out their IDEs of choice.


Unity Catalog in Azure Databricks

Meagan Longoria makes a recommendation:

Unity Catalog in Databricks provides a single place to create and manage data access policies that apply across all workspaces and users in an organization. It also provides a simple data catalog for users to explore. So when a client wanted to create a place for statisticians and data scientists to explore the data in their data lake using a web interface, I suggested we use Databricks with Unity Catalog.

Read on to learn more about what the Unity Catalog does.


Reading Serverless SQL Pool Data with Data Factory

Koen Verbeeck wants to read from the serverless SQL pool in Azure Synapse Analytics:

We have some data we can query using the serverless SQL pools in Azure Synapse Analytics. For this blog post, I’m querying data that is stored in Azure Cosmos DB. Read the blog post How to Store Normalized SQL Server Data into Azure Cosmos DB to learn more about how that data got there.

Suppose I now want to read the data using Azure Data Factory. You can read data from Cosmos DB directly, but let’s pretend I want to do some transformations first using my favorite language: SQL. How can we do this?

Read on to learn how.


Loading Normalized Data into Cosmos DB

Koen Verbeeck does a bit of shuffling:

I loaded the data into a table in Azure SQL DB. For demo purposes, I want to transfer this data from a SQL table to a container in Azure Cosmos DB (with the NoSQL API). There are plenty of resources on the web on how to transfer a simple relational table to Cosmos DB, but I have some additional complexity. One column – flavor profiles – contains a list of flavors that is assigned to a beer.

Click through for one way to organize the data when dealing with arrays.
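
To make the shape of the problem concrete, here is a minimal sketch of turning one relational row into a Cosmos DB document with an array. It assumes the flavors arrive as a comma-separated string in a single column. The column names, document field names, and delimiter are all hypothetical, and this is not necessarily the approach Koen takes:

```python
import json


def row_to_document(row: dict) -> dict:
    """Convert one relational row into a document with a flavor array.

    Assumes hypothetical columns: beer_id, name, flavor_profiles
    (a comma-separated string, possibly NULL).
    """
    raw = row.get("flavor_profiles") or ""
    # Split the delimited column, trimming whitespace and dropping empties.
    flavors = [part.strip() for part in raw.split(",") if part.strip()]
    return {
        "id": str(row["beer_id"]),  # Cosmos DB expects the id to be a string
        "name": row["name"],
        "flavorProfiles": flavors,
    }


if __name__ == "__main__":
    sample = {"beer_id": 1, "name": "Example Tripel",
              "flavor_profiles": "fruity, spicy"}
    print(json.dumps(row_to_document(sample), indent=2))
```

Storing the flavors as an array inside the beer document keeps everything about one beer in a single item, rather than requiring a join across a separate table.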


Invoke External REST Endpoints from Azure SQL DB

Rob Farley is impressed:

This internal procedure is new in Azure SQL DB in 2022. I think it presents a significant change to the way we do things in the world of SQL, and makes some other tools a whole lot more useful as well.

sp_invoke_external_rest_endpoint lets me send data to a REST API from within a stored procedure. Invoking an HTTP REST endpoint – as simple as that. And while I know you’re probably thinking, “But I can send data to a REST API from anywhere – why do I need to do it from within a stored procedure?”, I want to describe a few scenarios to you.

I like having the functionality, though I would want to control how frequently my teams use it. The reason is that this potentially makes your database a domain boundary (when thinking in domain-driven design concepts).


Continuous Backup for Cosmos DB

Manvendra Singh wants a backup:

This article will explore Continuous backup and steps to configure it for a new Azure Cosmos DB account or an existing Cosmos DB account. Azure Cosmos DB is a fully managed and highly secure NoSQL database service on the Azure cloud that is designed for modern-day application development. It automatically runs backups for its databases on separate Azure blob storage at regular intervals without affecting the performance, availability, and provisioned resource units (RUs), to ensure data protection from a data recovery standpoint, which can be needed in case of data corruption, deletion, or erroneous data updates.

Click through for the process and some limitations.


Monitoring Azure SQL DB Restore Progress

Sudhir Raparia doesn’t have time to wait:

Database Backup & Restore capabilities are crucial for ensuring Business continuity and Disaster recovery. Restore database operation is usually done in critical situations like hardware failure, application errors, ransomware attacks, accidental deletion of database etc., to restore a production database to latest known stable state. In such critical situations users would want to track the progress of restore operation accurately so that they can plan for subsequent actions and/or alternatives.

Currently in Azure SQL DB, you can view the database restore progress either using Portal or using T-SQL as follows:

Click through for information on that DMV, as well as a recent change to it in Azure SQL DB (though not yet Azure SQL Managed Instance).
