
Category: Cloud

An Overview of Azure IoT Central

James Serra looks at IoT Central:

This is a short blog to give you a high-level overview of a product called Azure IoT Central. I saw this fairly new Azure product (GA Sept 2018) in use for the first time at a large manufacturing company that was using it at their manufacturing facility (see Grupo Bimbo takes a bite out of production costs with Azure IoT throughout factories). They have thousands of sensors that are collecting data for all the machines used in producing their products. In short, think of it as an “Application Platform as a Service (aPaaS)” for quickly building IoT solutions. It boxes up IoT Hub, Device Provisioning Service (DPS), Stream Analytics, Data Explorer, SQL Database, Time Series Insights, and Cosmos DB to make it easy to quickly build a solution and get value out of the IoT data. To get an idea of what this solution would look like, check out the IoT Central sample for calculating Overall Equipment Effectiveness (OEE) of industrial equipment.

I haven’t seen much use of this service, as generally any use case I’ve seen around IoT quickly turns into using IoT Hub and IoT Edge to develop custom code.


An Overview of the Microsoft Defender Ecosystem

Alan La Pietra looks at all the Defenders you can get your hands on:

Microsoft Defender Antivirus is available in Windows 10 and Windows 11, and in versions of Windows Server

Microsoft Defender Antivirus is a major component of your next-generation protection in Microsoft Defender for Endpoint

Microsoft Defender Antivirus is built into Windows, and it works with Microsoft Defender for Endpoint to provide protection on your device and in the cloud

I see the hand of marketing in this. Which means they’ll probably all have different names nine months from now.


KQL Series

Hamish Watson does a document dump:

So what did we do here?

It searched our stored security events in the SecurityEvent table for all Accounts that had a successful login in the last 3 hours, and we chose to display only the Account and the number of log off events per Account, in numerical order with the highest at the top.

So far I’ve introduced some new operators and things – but what is a really quick way to learn KQL?

Start with this post and just keep navigating forward. Hamish has ten posts in total.
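For reference, a SecurityEvent aggregation along the lines Hamish describes might look something like this when run from Python via the azure-monitor-query SDK. This is just a sketch: the workspace ID is a placeholder, and I'm counting successful logons (EventID 4624) per account.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# Successful logons (EventID 4624) per Account, highest count first.
query = """
SecurityEvent
| where EventID == 4624
| summarize LogonCount = count() by Account
| order by LogonCount desc
"""

# timespan restricts the query to the last 3 hours, per the example.
response = client.query_workspace(
    workspace_id="<workspace-id>",  # placeholder
    query=query,
    timespan=timedelta(hours=3),
)

for row in response.tables[0].rows:
    print(row)
```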


Zero-Rename Writes in ElasticMapReduce Hive

Suthan Phillips, et al, show off some updates to the way Hive transactions commit in AWS’s ElasticMapReduce:

Our customers use Apache Hive on Amazon EMR for large-scale data analytics and extract, transform, and load (ETL) jobs. Amazon EMR Hive uses Apache Tez as the default job execution engine, which creates Directed Acyclic Graphs (DAGs) to process data. Each DAG can contain multiple vertices from which tasks are created to run the application in parallel. Their final output is written to Amazon Simple Storage Service (Amazon S3).

Hive initially writes data to staging directories and then moves it to the final location after a series of rename operations. This design of Hive renames supports task failure recovery, such as rescheduling a failed task with another attempt, running speculative execution, and recovering from a failed job attempt. These move and rename operations don’t have a significant performance impact in HDFS, where each is only a metadata operation, but on Amazon S3 performance can degrade significantly based on the number of files written.

This post discusses the new optimized committer for Hive in Amazon EMR and also highlights its impressive performance by running a TPCx-BB performance benchmark and comparing it with the Hive default commit logic.

Read on for a description of how commit operations work in general and how the updated Hive committer can help with certain types of queries.
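To make the cost difference concrete, here's a minimal sketch (not EMR's actual committer code) of the two commit styles the post contrasts: on HDFS or a POSIX filesystem a rename is a metadata-only operation, while on S3 a "rename" is a full object copy plus a delete.

```python
import os

import boto3


def commit_via_rename(staging_path: str, final_path: str) -> None:
    """HDFS/POSIX-style commit: move staged output into place.
    A rename is metadata-only, so its cost is independent of file size."""
    os.replace(staging_path, final_path)


def commit_via_copy(bucket: str, staging_key: str, final_key: str) -> None:
    """S3-style 'rename': there is no rename primitive, so each file is
    copied in full and then deleted. Commit cost grows with the number
    and size of files written. Bucket and keys are hypothetical."""
    s3 = boto3.client("s3")
    s3.copy_object(
        Bucket=bucket,
        Key=final_key,
        CopySource={"Bucket": bucket, "Key": staging_key},
    )
    s3.delete_object(Bucket=bucket, Key=staging_key)
```

As the name "zero-rename" suggests, the optimized committer's win comes from keeping that second pattern off the commit path.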


Right to Be Forgotten in Delta Lake

Milos Colic, et al, tackle a tricky problem:

With Delta, we have one more tool at our disposal to address GDPR compliance and, in particular, “the right to be forgotten”: VACUUM. The VACUUM operation removes files that are no longer needed and that are older than a predefined retention period. The default retention period is 30 days, to align with the GDPR definition of undue delay. Our earlier blog on a similar topic explains in detail how you can find and delete personal information related to a consumer by running two commands:

The part I’m finding tricky here is, how does this handle “time travel” scenarios in which you’re looking at prior iterations of data? I haven’t run through all of the scenarios so this is just speculation, but it seems that even with all of these changes, you’d still have to worry about historical data containing that sensitive information.
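For reference, the VACUUM side of this might look like the following in PySpark with the delta-spark package. This is a hedged sketch with a hypothetical table path, not necessarily the pair of commands the post runs.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

# Assumes a Spark session configured with the delta-spark package.
spark = SparkSession.builder.getOrCreate()

# Hypothetical table path.
customers = DeltaTable.forPath(spark, "/mnt/delta/customers")

# Remove data files that are no longer referenced by the current table
# version and are older than the retention window (in hours);
# 720 hours matches the 30-day default mentioned in the quote.
customers.vacuum(720)
```

This bears directly on the time travel question: once VACUUM deletes a data file, time travel to any version that referenced it no longer works, so the retention window is also the bound on how long that historical data lingers.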


Migrating SSIS On-Prem Workloads into Azure

Jitendra Yadeo has put together a how-to guide:

– There can be a scenario where an organization wants to migrate its existing SSIS ETL processes to the cloud, so instead of rewriting SSIS packages using cloud-specific ETL tools like Azure Data Factory, we can directly migrate the SSIS packages and call them through Azure Data Factory.

– The goal of this blog is to show how SSIS packages hosted on-premises can be migrated to Azure Data Factory (ADF) using the Azure-SSIS Integration Runtime (IR).

Read on for a step-by-step guide.
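Once the packages are deployed to the Azure-SSIS IR, a pipeline wrapping an Execute SSIS Package activity runs like any other ADF pipeline. As a sketch (hypothetical resource and pipeline names, using the azure-mgmt-datafactory SDK), kicking one off from Python looks something like this:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Hypothetical subscription, resource group, factory, and pipeline names.
client = DataFactoryManagementClient(
    DefaultAzureCredential(),
    "<subscription-id>",
)

# Trigger an ADF pipeline that wraps an Execute SSIS Package activity.
run = client.pipelines.create_run(
    resource_group_name="rg-etl",
    factory_name="adf-etl",
    pipeline_name="RunMigratedSsisPackage",
)

print(run.run_id)
```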


Configuring FIDO2 for Azure Active Directory

Joey D’Antoni takes us through a process:

If this sounds scary, and it does to me, who is by far not an expert in all things security but knows a little bit, you may ask: what are some alternative solutions? The answer to that question is FIDO2, a different protocol for MFA and auth. Remember all of that stuff Microsoft talks about with passwordless login? That’s all based around FIDO2. I configured this for DCAC’s Azure Active Directory yesterday, and I wanted to walk you through the steps.

Click through to learn how.


Azure ML Well-Architected Framework Review

Ben Brauer has good news:

Microsoft offers prescriptive guidance called the Well-Architected Framework that optimizes workloads implemented and deployed on Azure. This guidance has been generalized for most workloads and creates a basis for reliable and secure applications that are cost optimized.

We have begun to build on this base content set to include more precise guidance for specific workload types, such as machine learning, data services and analytics, IoT, SAP, mission-critical apps, and web apps. Machine Learning was the first branch from the base content, which came to fruition in the fall of 2021.

In case you have never used the Azure Well-Architected Review assessment tool, it’s really useful. It can take hours (or days) to go through the review, but if you take it seriously and have the right people in the room giving answers, you’ll get concrete guidance on how to optimize your Azure-based solutions.
