Press "Enter" to skip to content

Author: Kevin Feasel

When the Transaction Begins

Josh Darnell points out that transactions don’t really begin with BEGIN TRAN:

Of course, no one reads the documentation. And even if they do, they certainly don’t continue on to the 4th paragraph of the “General Remarks” section. That’s like going to the second page of Google search results.

If one did trudge on through the docs, they would find this gem:

Although BEGIN TRANSACTION starts a local transaction, it is not recorded in the transaction log until the application subsequently performs an action that must be recorded in the log, such as executing an INSERT, UPDATE, or DELETE statement. An application can perform actions such as acquiring locks to protect the transaction isolation level of SELECT statements, but nothing is recorded in the log until the application performs a modification action.

Well that’s interesting! Let’s check it out.

Four paragraphs? Where’s the tl;dr, Microsoft Docs?

Read the whole thing even though Josh just made a joke about nobody reading the whole thing and I followed along with it.

Comments closed

Contrasting Common Table Expressions and Temp Tables

Brent Ozar has some advice on when to use common table expressions versus temporary tables:

I’d suggest starting with CTEs because they’re easy to write and to read. If you hit a performance wall, try ripping out a CTE and writing it to a temp table, then joining to the temp table.

This is my advice, too. Start with the thing which is easiest for you to develop and maintain. If it suffices for performance, stick with it; otherwise, move to the next-lowest level of complication. Stop when you have good enough performance. This optimizes for one of the most precious resources people rarely think about: developer maintenance time. Developers are pretty expensive, so the more time they spend trying to understand complex code, the less time they’re doing stuff which pushes the business forward.

Comments closed

Problems With Renaming SQL Server

Jamie Wick goes through some of the problems you might experience when renaming a SQL Server instance:

Monitoring Agents – The first (and easiest) thing to be fixed were the monitoring agents (ie. MS Systems Center Operations Manager). All that took was a quick Redeploy and they came back online.

SQL Agent Jobs – The next issue was SQL Agent Jobs. Many of the maintenance jobs (eg. syspolicy_check_schedule_xxxx) have the server name hardcoded into them. This can be easily remediated by editing the job step properties and replacing the old server name with the new one.

Read on for more problems you might run into. None of these is insurmountable but they are annoying enough that I wouldn’t want to rename servers willy-nilly.

Comments closed

HBase Updates in CDH 6.2

Krishna Maheshwari announces updates to the Cloudera Distribution of Hadoop:

Starting with CDH 6.2, Cloudera now includes the ability to use Intel’s newly released Optane Memory as an alternate destination for the 2nd tier of the bucket cache.  This deployment configuration enables you to have ~3x the size of the cache for constant cost (as compared to off-heap cache on DRAM). It does incur some additional latency compared to the traditional off-heap configuration, but our testing indicates that by allowing more (if not all) of the data’s working set to fit in the cache the set up results in a net performance improvement when the data is ultimately stored on HDFS (using HDDs).

When deploying to the cloud or using on-prem object storage, the performance improvement will be even better as object storage tends to be very expensive for random reads of small amounts of data.

There aren’t too many changes to HBase in the blog post, but the two mentioned are pretty good ones.

Comments closed

Connecting PolyBase to Spark

I have a blog post connecting PolyBase to a Spark cluster:

If you do define your Spark DataFrames well, you get a much happier result. Here’s me creating a better-looking DataFrame in Spark:

import org.apache.spark.sql.functions._
spark.sql("""
SELECT
INT(SUMLEV) AS SummaryLevel,
INT(COUNTY) AS CountyID,
INT(PLACE) AS PlaceID,
BOOLEAN(PRIMGEO_FLAG) AS IsPrimaryGeography,
NAME AS Name,
POPTYPE AS PopulationType,
INT(YEAR) AS Year,
INT(POPULATION) AS Population
FROM NorthCarolinaPopulation
WHERE
POPULATION <> 'A'
""")
.write.format("orc").saveAsTable("NorthCarolinaPopulationTyped")

It’s not all perfect, though: I also cover driver problems that I ran into here with Spark and Hive.

Comments closed

Finding High-Variance Memory Grants

Erin Stellato shows how you can use Query Store to track the variance of memory grant requests:

One of the more perplexing problems to troubleshoot in SQL Server can be those related to memory grants. Some queries need more memory than others to execute, based on what operations need to be performed (e.g. sort, hash). SQL Server’s optimizer estimates how much memory is needed, and the query must obtain the memory grant in order to start executing. It holds that grant for the duration of query execution – which means if the optimizer overestimates memory you can run into concurrency issues. If it underestimates memory, then you can see spills in tempdb. Neither is ideal, and when you simply have too many queries asking for more memory than is available to grant, you’ll see RESOURCE_SEMAPHORE waits. There are multiple ways to attack this issue, and one of my new favorite methods is to use Query Store.

Click through for a demonstration.

Comments closed

Business Cases for SQL Server 2019

Kevin Chant gives you business cases for why you should upgrade to SQL Server 2019:

1. Polybase
Currently, you might import large amounts of data from another relational database system like Oracle or MongoDB into your SQL Server database. If so, you can use Polybase instead.

Polybase has certainly been improved in SQL Server 2019. For instance, more data sources are now available. Plus, you can even install Polybase on SQL Server 2019 on Linux as well.

Instead of importing the data from the source you can use Polybase to connect to it remotely. Which means that you can run your queries against the source directly instead.

Read on for more in this vein. Here it’s less about the technical capabilities and more about making life easier for other people in the business.

Comments closed

CAST and CONVERT Make Expressions Nullable

Daniel Hutmacher points out a side effect of using CAST() and CONVERT():

Suppose we want to set up a view in the new solution that mirrors the names and definitions of the old table, so the legacy integration can use that view going forward:

CREATE OR ALTER VIEW new.the_table_like_before
AS
SELECT CAST(id AS varchar(32)) AS id,
CAST([row] AS int) AS [row],
CAST(date_loaded AS datetime) AS dt
FROM new.the_table;

Now, if you check out the resulting datatypes of the view, you’ll notice that all the columns are marked nullable, even though they’re all based on non-nullable columns, so the values in the view could never be null.

Read on for a couple possible solutions.

Comments closed

Wireshark and the SQL Server Browser Service

Joy George Kunjikkur takes a look at the SQL Server Browser service using Wireshark:

Normally on development environment the 1434 port would be open and SQL Server Browser running in SQL Server machine. If developers code based on that assumption, it may fail in higher environments. Most of the developers know what might have happened. But some may not and ends up in complete darkness. The symptom of this issue will be connection failure from application to SQL Server database.

This post is mainly to demonstrate how to check network failures on SQL Server Browser operations. The prerequisite is the availability of Wireshark in the machine where application runs and basic knowledge how to run the same. The above link has some screenshots which would help to run Wireshark for UDP connections.

This is usually the first service I disable post-installation.

Comments closed

Antivirus and SQL Server

Randolph West proffers advice should your IT team require installing antivirus software on a server with SQL Server running:

This is why it is documented that we should exclude SQL Server from any AV (anti-malware) detection products, so that it can get on with doing what it does best.

Yes, it’s formally documented. This is why we should read documentation when installing things. While it’s super-easy to click “Next,” “Next,” “Next,” that should not be the case with a complex product like SQL Server.

Read on for the list of exceptions you should add and processes to avoid scanning.

Comments closed