
Category: Cloud

Creating an AML Workspace and Trying the Studio

Tomaz Kastrun continues an advent of Azure ML. First up, Tomaz creates a workspace:

You will select “New workspace”. For now, we will work on a workspace. But just to mention, the “New registry” will enable you to share assets among different workspaces, support multi-region replication and help you provision all resources to facilitate region replications.

From there, the focus shifts to using Azure Machine Learning Studio:

In this overview page, you can click the button “Launch studio” in the middle of the workspace or you can copy and paste the Studio web URL provided under the “Essentials” to start the Studio.

But before we launch the Studio, let’s explore some additional settings worth mentioning.

Comments closed

Azure SQL Managed Instance Performance

Reitse Eskens wraps up a series on Azure SQL performance comparisons:

So far, the blogs were about the really SaaS databases; the database is deployed and you don’t have to think about it anymore. This ease of use comes at a ‘price’. You’ve got no control whatsoever over files, and you’ve lost the SQL Agent and a number of other features. The managed instance is a bit different. When I was testing, you could see the TempDB files but not change them. Since then, a few changes have been made to this tier so that you’re able to change settings and, as Niko Neugebauer told the data community on Twitter, there are more changes coming. With the managed instance, you get the agent back and you can run cross-database queries again. So you can safely say the managed instance is a hybrid between your trusty on-premises server and the fully managed Azure SQL database.

Click through for Reitse’s thoughts.


An Intro to Azure Machine Learning

Tomaz Kastrun has a new Advent challenge:

Azure Machine Learning (also called Azure Machine Learning Service, abbreviated AML) is Azure’s cloud service for creating, managing and productionalising machine learning projects. It is a collaborative tool for data scientists, machine learning engineers, and data engineers, covering their daily and operational tasks, from creating and training to deploying and managing predictive models and machine learning solutions.

Click through for the introduction.


Apache Ranger on ElasticMapReduce

Laurence Geng looks at Ranger:

Whether you’ve successfully made it before or not, installing and integrating Windows AD/OpenLDAP + Ranger + EMR is a very hard job. It is complicated, error-prone, and time-consuming for the following reasons:

Read on for the list of reasons, some background on Ranger, and an automated installer intended to make life a bit easier.


Controlling Cosmos DB Time to Live

Rahul Mehta pulls out the stopwatch:

As Microsoft states, Azure Cosmos DB “is a fully managed NoSQL database service for building scalable, high-performance applications”. Cosmos DB is widely used for storing NoSQL data, with a choice of APIs: Core (SQL), MongoDB, Cassandra, Table, and Gremlin.

With wide usage, the amount of stored content also increases, sometimes by gigabytes a day. With that much content, retention and archival of data are among the most common asks from customers. Today, we are going to talk about how to retain data and remove unnecessary data periodically from Azure Cosmos DB. Before we do that, we need to understand a storage concept called “Container”.

Read on to learn about containers, as well as the built-in way to garbage collect data.
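To make the time-to-live idea concrete, here is a minimal sketch in plain Python that models how Cosmos DB decides whether an item is eligible for removal. It is not taken from Rahul's post and does not call the Cosmos DB SDK. It assumes the documented behavior that a container carries a default TTL in seconds (unset means TTL is off, -1 means no expiry by default), that an item may override it with its own `ttl` property, and that the clock starts at the item's last-modified timestamp `_ts`. The function name `is_expired` is hypothetical.

```python
from typing import Optional


def is_expired(item: dict, container_default_ttl: Optional[int], now: int) -> bool:
    """Approximate model of the Cosmos DB TTL rule (illustration only).

    item: a document with "_ts" (last-modified time, epoch seconds) and an
          optional "ttl" override in seconds.
    container_default_ttl: None when TTL is disabled on the container,
          -1 for "never expire by default", or a positive number of seconds.
    now: current time in epoch seconds.
    """
    # TTL is off for the whole container: item-level ttl values are ignored.
    if container_default_ttl is None:
        return False

    # An item-level ttl, when present, overrides the container default.
    ttl = item.get("ttl", container_default_ttl)

    # -1 means "never expire".
    if ttl == -1:
        return False

    # Otherwise the item expires ttl seconds after it was last modified.
    return now >= item["_ts"] + ttl


# Hypothetical items, all last modified at t = 1000.
plain = {"id": "a", "_ts": 1000}
pinned = {"id": "b", "_ts": 1000, "ttl": -1}
short_lived = {"id": "c", "_ts": 1000, "ttl": 10}

print(is_expired(plain, 60, 1100))        # container default of 60 s applies
print(is_expired(pinned, 60, 1100))       # item opts out of expiry
print(is_expired(short_lived, -1, 1100))  # item-level ttl on a "-1" container
```

The actual deletion happens inside the service as a background task, so this only models the eligibility rule, not the timing of the cleanup.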


Migrating Azure Analysis Services to Power BI Premium

Gilbert Quevauvilliers dumps AAS:

I thought it would be a good idea to walk through the steps when looking to migrate AAS to PBI.

In the past when I had to do this for clients it was a lot of manual steps and a lot of small things to get just right. This process is now seamless and awesome!

Reviewing Gilbert’s step-by-step process, yeah, this is easy, though watch out for the pitfalls Gilbert found.


Redshift Query Editor v2

Anusha Challa, et al., announce a new version of a Redshift query editor:

Amazon Redshift is a fast, fully managed, petabyte-scale cloud data warehouse. You have the flexibility to choose from provisioned and serverless compute modes. You can start loading and querying large datasets conveniently in Amazon Redshift using Amazon Redshift Query Editor v2, a web-based SQL client application.

It’s worth a try if you’re a Redshift user, though I’d imagine that frequent Redshift users have already sorted out their IDEs of choice.


Unity Catalog in Azure Databricks

Meagan Longoria makes a recommendation:

Unity Catalog in Databricks provides a single place to create and manage data access policies that apply across all workspaces and users in an organization. It also provides a simple data catalog for users to explore. So when a client wanted to create a place for statisticians and data scientists to explore the data in their data lake using a web interface, I suggested we use Databricks with Unity Catalog.

Read on to learn more about what the Unity Catalog does.


Reading Serverless SQL Pool Data with Data Factory

Koen Verbeeck wants to read from the serverless SQL pool in Azure Synapse Analytics:

We have some data we can query using the serverless SQL pools in Azure Synapse Analytics. For this blog post, I’m querying data that is stored in Azure Cosmos DB. Read the blog post How to Store Normalized SQL Server Data into Azure Cosmos DB to learn more about how that data got there.

Suppose I now want to read the data using Azure Data Factory. You can read data from Cosmos DB directly, but let’s pretend I want to do some transformations first using my favorite language: SQL. How can we do this?

Read on to learn how.


Loading Normalized Data into Cosmos DB

Koen Verbeeck does a bit of shuffling:

I loaded the data into a table in Azure SQL DB. For demo purposes, I want to transfer this data from a SQL table to a container in Azure Cosmos DB (with the NoSQL API). There are plenty of resources on the web on how to transfer a simple relational table to Cosmos DB, but I have some additional complexity. One column – flavor profiles – contains a list of flavors that is assigned to a beer.

Click through for one way to organize the data when dealing with arrays.
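As a rough illustration of the shape of the problem, the sketch below turns a relational row whose flavor column holds a delimited list into a JSON document with a real array. This is not Koen's approach, which uses Azure tooling. The column names (`beer_id`, `name`, `flavor_profiles`), the comma delimiter, the sample rows, and the function `row_to_document` are all assumptions made for the example.

```python
import json


def row_to_document(row: dict, delimiter: str = ",") -> dict:
    """Convert one relational row into a document with an array of flavors.

    Assumes the hypothetical "flavor_profiles" column holds a delimited
    string such as "fruity, spicy"; NULL or empty becomes an empty array.
    """
    raw = row.get("flavor_profiles") or ""
    flavors = [part.strip() for part in raw.split(delimiter) if part.strip()]
    return {
        # Cosmos DB (NoSQL API) expects the "id" property to be a string.
        "id": str(row["beer_id"]),
        "name": row["name"],
        "flavorProfiles": flavors,
    }


# Made-up sample rows standing in for the SQL table.
rows = [
    {"beer_id": 1, "name": "Example Tripel", "flavor_profiles": "fruity, spicy, sweet"},
    {"beer_id": 2, "name": "Example Stout", "flavor_profiles": None},
]

documents = [row_to_document(r) for r in rows]
print(json.dumps(documents, indent=2))
```

Keeping the flavors as an array inside the beer document means a single read returns the whole beer, at the cost of rewriting the document whenever the list changes.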
