Press "Enter" to skip to content

Author: Kevin Feasel

The Logging Costs of DROP TABLE and TRUNCATE

Paul Randal explains that DROP TABLE and TRUNCATE TABLE are logged operations:

Hopefully you all know that it’s a myth that DROP TABLE and TRUNCATE TABLE are non-logged operations. If you didn’t know that, read my blog post on that explains about the deferred drop mechanism. Both operations are fully logged, and will generate quite a bit of transaction log.

The bulk of the log that’s generated comes from having to log the deallocation of extents and the pages within them. For each extent, a bit must be cleared in the corresponding GAM page and IAM page, and all 8 pages in the extent must be marked as deallocated in the corresponding PFS page (turning off the 0x40 bit in each PFS byte). So that’s three log records per allocated extent.

To get a feeling for how much that is, Paul provides an example of a 20TB table being dropped.

Leave a Comment

Using Terraform to Tag Created Date

John Martin has an interesting use case for tagging in Terraform:

One of the key properties missing from Azure resources, in my opinion anyway, is a CreatedDate. This can be largely overcomes with Azure policy, but what if you don’t have access to create one that applies a timestamp tag at resource creation?

It is possible to use Terraform to tag the resource and set the value for when the resource is created. There is a little more work that needs to go into it to ensure that once it is set that Terraform does not overwrite it on subsequent deployments. But, it is achievable and brings this into your control if needed.

Click through to see how.

Leave a Comment

Azure Synapse Pathway

John Macintyre announces a new product:

Azure Synapse Pathway connects to the source system and inspects details about your database objects. An assessment report captures further details on the database objects that can be translated into Azure Synapse Analytics. With Azure Synapse Pathway the source database objects are automatically converted and optimized to T-SQL code on Azure Synapse Analytics. This means your existing code, whether a thousand or million lines of code, will be converted by Azure Synapse Pathway.

As a result of these capabilities, the traditional process of manual code conversion can now be automated in a fraction of the time; all while cutting out manual errors and reducing the total cost of the migration.

They’re starting with a few data sources (including Snowflake), but it’s an interesting product. I could see it useful for getting 80-85% of the migration done, though I don’t trust auto-generated code to be optimal.

Leave a Comment

Batch Mode on Row Store in SQL Server 2019

Deepthi Goguri looks at a nice performance improvement in SQL Server 2019:

In the previous post, we learned about Table variable deferred compilation. In this blog, lets focus on the batch mode on rowstore feature introduced in SQL Server 2019. This feature improves the performance of the analytical queries using the batch mode query processing. This feature is for CPU optimization helping analytical queries to run faster. We do not have to specify this option if the database compatibility is 150.

This feature is especially for the analytical queries for CPU bound analytic workloads without needing the columnstore indexes. We can specifically mention the hints in the query for using the batch mode or not.

There are specific rules which must be met before it kicks in, but the performance benefit can be significant. If you’re running SQL Server 2017, you needed a columnstore index on a table to get batch mode, though there is a trick around this: you can create a filtered, nonclustered columnstore index WHERE 1=0 so that it doesn’t have any rows. Then, any queries which hit that table are potentially eligible for batch mode processing, even though none of them use the columnstore index.

Leave a Comment

Creating a Database Publish Profile in Visual Studio

Elizabeth Noble shows us how to create a database publish profile using Visual Studio:

One of our fears was always how to prevent losing data and critical data code. Here were publish profiles to our rescue. We also found that some of our database code had specific values depending on the environment or contained references to other databases. Once again, publish could solve these problems!

While I’d love to say that you could use ADS to manage your database projects, that just isn’t true right now. However, we have a way to help you get a publish profile created. If you don’t want to use Visual Studio yourself, you might want to ask your Developer friends real nice and see if they’d be willing to help you out.

Click through for a video and a sample of what a publish profile looks like.

Leave a Comment

Applied ML Prototypes

Alex Bleakley and Santiago Giraldo announce Applied ML Prototypes:

To directly address these challenges, we’ve released Applied ML Prototypes (AMPs) — a revolutionary new way of developing and shipping enterprise ML use cases — which provide complete ML projects that can be deployed with one click directly from Cloudera Machine Learning. AMPs enable data scientists to go from an idea to a fully working ML use case in a fraction of the time, with an end-to-end framework for building, deploying, and monitoring business-ready ML applications instantly. 

AMPs move the starting line for any ML project by enabling data scientists to start with a full end-to-end project developed for a similar use case, including a trained and deployed ML model, as well as prebuilt predictive business applications, out of the box. This means that ML development teams can tackle their own ML business use cases more quickly, from those involving churn modeling, to sentiment analysis, to anomaly detection and beyond.

Getting past the marketing fluff, there are some interesting ideas here.

Leave a Comment

Installing Spark on Windows Subsystem for Linux

David Alcock wants Spark, but not Windows Spark:

The post won’t cover any instructions for installing Ubuntu and instead I’ll assume you’ve installed already and downloaded the tgz file from the Apache Spark download page (Step 3 in the above link).

Let’s go straight into the terminal window and get going! I’ve put the commands in bold text (don’t include the $) just so anyone can see a bit easier and who also prefers to ignore my jibberish! 

Click through for the instructions.

Leave a Comment

Tracking Log Rolling in SQL Server

Andrea Allred wants to figure out how frequently logs roll over:

In January, the awesome Tim Radney (b|t) talked to the Utah user group about best practices. One that he mentioned was rolling over your error logs everyday and keeping 35 logs (a month plus 3 reboots). I loved this idea and implemented it using what I had done here and adding it to an agent job.

Then I realized we didn’t have any alerts on if our logs were rolling too much. Way back in my career, it used to be something that I would watch and it could mean someone was trying to hack your system and cover their tracks by rolling your logs over a bunch. I fought so much with figuring how to tell if my logs are rolling over, I had to save it for the future.

Click through for a T-SQL solution to the problem.

Leave a Comment

Combining SendTo and Powershell

Mark Wilkinson shares a script with us:

If you are not familiar, SendTo options are those available when you right click on a file/folder in file explorer and select the Send To option in the menu. When you use this option, the currently selected files/folders are passed to the SendTo shortcut as a space delimited list of files and folders. This is important to know so you better understand what needs to be done to read that list.

I can confirm that this works well for deploying script out, especially when they need to go to multiple servers or multiple databases on servers. That functionality takes a bit more effort to write, but combine Mark’s code with Jess’s and you are well on your way.

Leave a Comment

Azure Charts

This is an interesting site I just learned about this morning:

Project mission: Communicating Azure’s current state, structure and updates in a compact digestible way.

Concept: Cloud Charts Manifesto

Data sources: Public updates, RSS channels and web pages are used as primary data sources.

While developed by a Microsoft employee, Azure Charts is not a Microsoft service or product. This is a personal project, there are no implicit or explicit obligations related to it. If you want to receive updates about Azure Charts, feel free to follow or connect on LinkedIn.

It’s out of the norm for me to talk about a whole site like this, but I think it’s a really useful way of showing off these services in a non-confusing manner. It also gives some en passant insight on the relative interest in specific services based on industry.

If you do think this is interesting, please fill out a small form to make sure it stays public.

Leave a Comment