Press "Enter" to skip to content

Category: Cloud

Overly Large Executors in ElasticMapReduce

Dmitry Tolpeko notes a change to Amazon ElasticMapReduce:

So 50 executors were initially requested with the required memory 22528 and 4 vcores as expected, but actually 9 executors were created with 112640 memory and 20 cores that is 5x larger. It should have created 10 executors but my cluster does not have resources to run more containers.

Note: The second log row specifies allocated vCores:5, it is because of using DefaultResourceCalculator in my YARN cluster that ignores CPU and uses memory resource only. Do not pay attention to this, the Spark executor will still use 20 cores as it reported in the third log record above.

Click through for the reason.

Comments closed

Log Truncation Improvements in SQL Managed Instance Backup Restorations

Niko Neugebauer reports on some performance improvements:

There are multiple phases of the SQL MI database restore process, which, aside from the usual SQL Server restore operations, includes registering databases as an Azure asset for visibility & management within the Azure Portal. Additionally, the restore process forces a number of specific internal options, and some property changes such as forcing the switch to the FULL recovery model and forcing the database option PAGE_VERIFY to CHECKSUM, as well as eventually performing a full backup to start the log chain and provide full point-in-time- restore options through the combination of full and log database backups.

The restore operation on SQL MI also includes log truncation, and the execution time for the truncation has been vastly improved, which means that customers can expect their entire database restore process to become faster on both service tiers.

Click through to see what kind of performance gains you can expect.

Comments closed

Automating Bring-Your-Own-Key Rotation for TDE in Azure SQL DB

Shoham Dasgupta announces a new preview program:

Transparent data encryption (TDE) in Azure SQL Database and Managed Instance helps protect against the threat of malicious offline activity by encrypting data at rest.  TDE with Customer-Managed Key (CMK) enables Bring Your Own Key (BYOK) scenario for data protection at rest, by allowing a key stored in a customer-owned and customer-managed Azure Key Vault to be used as the TDE Protector on the server or managed instance.

When using TDE with Customer-Managed Key, one of the important responsibilities that customers need to perform on a regular basis is key rotation, that is, rotating the TDE Protector on the server by switching to a new key (or new version of the earlier key) from Azure Key Vault. Key rotation is a critical activity for an organization that is required to meet security and compliance objectives.

Automated key rotation for Azure SQL Database and Managed Instance is now available in preview, simplifying key management responsibilities for customers.

Click through to see how this works.

Comments closed

Max Server Memory and AWS

Andrea Allred runs into a weird issue on AWS RDS:

We tracked down every job that was touching the server and started to tune it, thinking that was just pushing us over the edge. We worked with AWS and finally one of our engineers noticed that our MAX Server Memory Setting was back at the SQL default. You know that insane default? Yes, it was there. But we had properly set that…three months ago when this stack was put in place.

Click through for the entire story, including symptoms and resolution.

Comments closed

Overview of Arc-Enabled SQL Managed Instances

Warwick Rudd continues an overview of Azure Arc-Enabled Data Services:

In our previous post, we mentioned the 2 types of data services that are supported and able to be managed by our newly deployed Data Controller:

– Azure Arc-enabled SQL Managed Instance

– Azure Arc-enabled PostgreSQL Hyperscale

In this pose we are going to have a look at the differences between an installation of Azure SQL Managed Instance and Azure Arc-enabled SQL Managed Instance.

This post doesn’t cover the actual deployment; Warwick promises that in his next post.

Comments closed

Azure Synapse Link for SQL Server 2022 and File Analysis

Kevin Chant digs into Azure Synapse Link for SQL Server 2022:

In this post I want to cover some file tests for Azure Synapse Link for SQL Server 2022 that I performed.

Because a while back I spotted something interesting whilst I was doing some initial tests for Azure Synapse Link for SQL Server 2022.

Which is when you add new data after the initial load that a new folder called ‘ChangeData’ appears in the storage account container. I noticed that the new file containing the insert was a comma separated value (csv) file. Whereas the table used for the initial load was a parquet file.

Is there a method to this madness? Click through to see Kevin’s tell-all story.

Comments closed

Continuing Arc-Enabled Data Services

Warwick Rudd continues a series on Azure Arc-Enabled Data Services. Part 5 takes us through what you can do with the Azure CLI:

In our previous post, we touched on the deployment of the Data Controller and being able to deploy via the Portal, Azure Data Studio, or CLI commands depending on whether you are implementing a directly or indirectly connected Data Controller.

Az Arcdata is a suite of CLI commands that allow command line management of the data controller and the Arc-enabled SQL Managed Instance once we have it configured.

Part 6 details the services available today:

Azure data services such as Azure SQL Managed Instance and Azure PostgreSQL are fully managed by Microsoft in the Azure Cloud. They provide you with evergreen environments because they are managed by Microsoft and always have the latest patches and feature offerings, while also providing you the ability to quickly and easily scale on demand based on the workload or requirements.

I do expect this set to grow over time.

Comments closed

Optimizing Azure Pricing for Storage and VMs

Shane Baldacchino continues a series on cost optimization in the cloud:

Cost. I have been fortunate to work for and help migrate one of Australia’s leading websites (seek.com.au) in to the cloud and have worked for both large public cloud vendors. I have seen the really good, and the not so good when it comes to architecture.

Cloud and cost. It can be quite a polarising topic. Do it right, and you can run super lean, drive down the cost to serve and ride the cloud innovation train. But inversely do it wrong, treat public cloud like a datacentre then your costs could be significantly larger than on-premises.

Click through for some good advice, including an appreciation of spot instances.

Comments closed

A Primer on Azure Arc-Enabled Data Services

Warwick Rudd has a four-parter on Azure Arc-Enabled Data Services. Part 1 sets the stage:

Utilising Azure Arc-enabled data services provides you the ability to take advantage of the Azure data services (SQL Server, Azure SQL Managed Instance, PostgreSQL) in a hybrid environment. This offering provides you with reduced administrative efforts in managing and maintaining your data services while giving you the same look and feel as if you were running in the Azure Cloud.

Part 2 looks at the Data Controller:

The Azure Arc Data Controller is a Kubernetes operator that performs all of the orchestration to ensure you achieve your desired state. This is the main component in the Azure Arc infrastructure that links the data services with the Arc-enabled hardware located either in your On-premises, Azure, or any other public cloud data center and your azure subscription.

The Arc data controller allows you to deploy, manage, secure, and monitor your deployed data services estate using Azure Data Studio or the Azure Portal (only for directly connected mode deployments) but giving you the same experience as if you were managing your data services from inside of the Azure Portal.

Part 3 deploys a Data Controller:

As previously mentioned there are 2 types of deployment available for your Arc Data Controller. In this post, we are going to have a look at deploying in the Arc Data Controller using the directly connected mode.

For a directly connected Arc Data Controller, we have direct connectivity to our Azure subscription. With this in mind, there are several options as we previously discussed on how to deploy the data controller. For this post, we are using the portal deployment method.

Finally, Part 4 covers management options:

With ADS open and running you can create connections to Arc Data Controllers the same as you can with Instances of SQL Server. In ADS we have under the connections area a section specific for Arc Data Controllers.

Check out all four posts.

Comments closed

Migrating Databases between SQL Managed Instances

Etienne Lopes performs a migration:

In this post I’m going to show a very simple way to migrate a database between two SQL Server managed Instances in Azure. I’m not a big fan of bacpac files (although I work with it when necessary) so I’ll use a different approach here. Besides, when creating a bacpac file using SSMS there are some schema validations that occur at the beginning that will abort the bacpac generation for example if the database holds three-part names inside stored procedures. While not supported in SQL Azure DB it is supported in SQL Managed Instances (as are cross-database queries), and it can be quite frustrating to experience this show stopper when using bacpac’s to migrate or copy databases between Managed Instances.

Click through for the demo. And yeah, I’ve run into limiting factors with bacpacs, such as having certificates for encrypting data (even if you back those up separately).

Comments closed