Cloud – Page 99 – Curated SQL

In 2019, AWS unveiled Amazon SageMaker Debugger, a SageMaker capability that enables you to automatically detect a variety of issues that may arise while a model is being trained. SageMaker Debugger captures model state data at specified intervals during a training job. With this data, SageMaker Debugger can detect training issues or anomalies by leveraging built-in or user-defined rules. In addition to detecting issues during the training job, you can analyze the captured state data afterwards to evaluate model performance and identify areas for improvement. This task is made easier with the newly launched XGBoost training report feature. With a minimal amount of code changes, SageMaker Debugger generates a comprehensive report outlining key information that you can use to evaluate and improve the model.
This post shows you an end-to-end example of training an XGBoost model on Sagemaker and how to enable the automatic XGBoost report functionality in Sagemaker Debugger to quickly and easily evaluate model performance and identify areas of improvement for your model. Even if you don’t have a lot of data science experience, you can still gauge how well the model performs and identify areas of improvement based on information provided by the report. The code from this post is available in the GitHub repo.

Click through for an example of this in action.

Comments closed

Using Terraform to Tag Created Date

Published 2021-03-03 by Kevin Feasel

John Martin has an interesting use case for tagging in Terraform:

One of the key properties missing from Azure resources, in my opinion anyway, is a CreatedDate. This can be largely overcomes with Azure policy, but what if you don’t have access to create one that applies a timestamp tag at resource creation?
It is possible to use Terraform to tag the resource and set the value for when the resource is created. There is a little more work that needs to go into it to ensure that once it is set that Terraform does not overwrite it on subsequent deployments. But, it is achievable and brings this into your control if needed.

Click through to see how.

Comments closed

A Warning on Using Distributed Network Names

Published 2021-02-24 by Kevin Feasel

Allan Hirt has a warning for us:

DNNs are supported as of SQL Server 2019 CU2 and require Windows Server 2016 or later. I wrote more about them in my blog post Configure a WSFC in Azure with Windows Server 2019 for AGs and FCIs. Go there if you want to see what they look like and learn more.
Right now, I cannot wholeheartedly recommend the use of DNNs for listeners or FCIs if you are using Enterprise Edition. Why?

Read on to learn why.

Comments closed

Azure Data Factory Pipeline Outcomes

Published 2021-02-18 by Kevin Feasel

We have proof that Meagan Longoria is a consultant:

Question: When an activity in a Data Factory pipeline fails, does the entire pipeline fail?
Answer: It depends

Read on to understand the circumstances of when a pipeline activity failure will doom an entire pipeline run and when recovery might be possible.

Comments closed

Azure Data Factory and JSON Array Hand-Offs

Published 2021-02-17 by Kevin Feasel

Rayis Imayev wants to pass a JSON array from one Azure Data Factory pipeline to another:

This next post came out of an error message during my attempt to pass a hard-coded array value between pipelines. Strangely, this use-case worked well in the pipeline that was already deployed in ADF, however, I was getting an error message while trying to test and execute this very same pipeline in a Debug mode.

Click through for the explanation.

Comments closed

Installing SQL Server on an Azure VM

Published 2021-02-15 by Kevin Feasel

Niels Berglund takes us through the steps of creating an Azure VM running SQL Server:

A while ago, I wanted to do a quick test on a new SQL installation, and I wanted the SQL installation to be on a “pristine” server. I was not keen on creating a new virtual machine on my local dev-box, as for that I would need to create a VM image etc., and it seemed like too much hassle for a lazy person like me. The obvious choice then is to do it in the cloud. How hard can that be, what could possibly go wrong?!
It turned out to not be as straight-forward as I thought it would be, but eventually, I managed to get it right. Since I probably need to do it again some time, I thought I’d write a post about it, so I have something to go back to. So here we go …

Niels goes through this in meticulous detail, as is the norm.

Comments closed

Availability Groups and the Shakes

Published 2021-02-15 by Kevin Feasel

Niko Neugebauer coins a term:

Disclaimer: I am using the word shake by my own initiative and no Microsoft Documentation ever to my knowledge ever mentioned that situation. Those shakes are represented most of the time as health events to the cluster, such as the Lease Timeout resulting in a sudden attempt of Failover.
Why did I choose that word ? I don’t know. Honestly. 🙂

Read on to see it in context around hosts, CPU, and especially I/O.

Comments closed

Azure SQL Database Startup Time

Published 2021-02-12 by Kevin Feasel

John McCormack has a tip for us:

The traditional methods used for to find the start up time for SQL Server don’t work in Azure SQL DB.
I searched high and low to find this and thought I’ve got to share, and hopefully make it search engine friendly. A traditional google or bing search wasn’t bringing up the best way to find this out. I saw a lot of complicated queries to pull data, convert it and estimate start up time using functions and all kinds of magic.

Click through for the one-liner script.

Comments closed

Hyperscale and RBIO_RG_STORAGE

Published 2021-02-11 by Kevin Feasel

Reitse Eskens runs into a strange bug in Azure SQL Database Hyperscale:

This single wait made sure our complete environment went dead in the water. Everything halted. To get some context, Microsoft has some documentation on this wait:
Occurs when a Hyperscale database primary compute node log generation rate is being throttled due to delayed log consumption at the page server(s).
Well, that’s not really helping, because that’s about everything they tell you about it.

Click through for Reitse’s findings and Microsoft’s advice.

Comments closed

Secure Cluster Connectivity in Azure Databricks

Published 2021-02-08 by Kevin Feasel

Abhinav Garg and Premal Shah have an announcement:

We’re excited to announce the general availability of Secure Cluster Connectivity (also commonly known as No Public IP) on Azure Databricks. This release applies to Microsoft Azure Public Cloud and Azure Government regions, in both Standard and Premium pricing tiers. Hundreds of our global customers including large financial services, healthcare and retail organizations have already adopted the capability to enable secure and reliable deployments of the Azure Databricks unified data platform. It allows them to securely process company and customer data in private Azure Virtual Networks, thus satisfying a major requirement of their enterprise governance policies.

Read on fore more detail about how this works.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Category: Cloud

Analyzing XGBoost Training Reports

Using Terraform to Tag Created Date

A Warning on Using Distributed Network Names

Azure Data Factory Pipeline Outcomes

Azure Data Factory and JSON Array Hand-Offs

Installing SQL Server on an Azure VM

Availability Groups and the Shakes

Azure SQL Database Startup Time

Hyperscale and RBIO_RG_STORAGE

Secure Cluster Connectivity in Azure Databricks