Press "Enter" to skip to content

Curated SQL Posts

Forcing a Shrink of TempDB

Monica Rathbun has a script for people who just want to watch the world burn:

Occasionally, we must resize or realign our Tempdb log file (.ldf) or data files (.mdf or .ndf) due to a growth event that forces the file size out of whack. To resize we have three options, restart the SQL Server Service, add additional files, or shrink the current file. We most likely have all been faced with run away log files and in an emergency situation restarting the SQL Services may not be an option but we still need to get our log file size smaller before we run out of disk space for example. The process of shrinking down that file can get tricky so I created this flow chart to help you out if you ever get into this situation.

Now its very important to note that many of these commands will clear your cache and will greatly impact your server performance as it warms cache backup. In addition, you should not shrink your database data or log file unless absolutely necessary. But doing so,  it can result in a corrupt tempdb.

Click through for the process, including a well-made flow chart with a large “Know what you’re doing before doing this” warning.

Comments closed

How SQL Server Stores XML Data

Randolph West continues a series on internal storage of data types:

The SQL Server XML data type supports both schema-defined and schema-less XML documents (typed and untyped), and partial documents. Partial documents are still considered well-defined XML, but don’t have surrounding <?xml>...</xml> tags for example. Either way, the storage engine validates the data when inserting or updating into an XML column.

The XML data type lets us store up to 2GB per column per row, but it’s prudent here to note that just because we can, it doesn’t mean we should.

Click through for the details.

Comments closed

GetAllTheErrorLogs: Combining Multiple Log Sources

Aaron Bertrand has a new project:

There’s a whole lot of grunt work in there that shouldn’t have to be done by a person. I don’t think you could automate the whole thing, because it is hard to predict exactly what events will be interesting and not, but I think 90% is achievable.

A colleague mentioned that they want to build something that would help, but even when that happens, that would up in proprietary code that only helps us. I saw Drew Furgiuele’s post on Building a Better Get-SQLErrorLog, and that gave me some ideas for what I would build. After reaching out to Drew, I created a GitHub repository with a working name of GetAllTheErrorLogs. Its elevator pitch is a simple sentence:

Powershell to assemble a timeline combining salient events from Windows Event Log, Failover Cluster log, and SQL Server errorlog.

Click through for the details as well as Aaron’s current progress.

Comments closed

Recursive Common Table Expressions

Itzik Ben-Gan continues a series on table expressions:

Recursive CTEs have many use cases. A classic category of uses has to do with handling of graph structures, like our employee hierarchy. Tasks involving graphs often require traversing arbitrary length paths. For example, given the employee ID of some manager, return the input employee, as well as all subordinates of the input employee in all levels; i.e., direct and indirect subordinates. If you had a known small limit on the number of levels that you might need to follow (the degree of the graph), you could use a query with a join per level of subordinates. However, if the degree of the graph is potentially high, at some point it becomes impractical to write a query with many joins. Another option is to use imperative iterative code, but such solutions tend to be lengthy. This is where recursive CTEs really shine—they enable you to use declarative, concise and intuitive solutions.

I’ll start with the basics of recursive CTEs before I get to the more interesting stuff where I cover searching and cycling capabilities.

Read the whole thing. There are some good uses of recursive common table expressions (I used them to figure out nested foreign key relationships, for example), so it’s worth your time.

Comments closed

Automating Database Restorations

Hugo Kornelis walks us through automated restoration of database backups:

Now I’ve been to quite a few conferences. And I’ve heard a lot of DBAs talk about best practices. One of the constants in those talks is: automate your restores. So I felt confident that a quick internet search would surely be enough to find me an existing script for restoring a database. Sure, I’d need to modify it to restore to a test database, but that should be minimal effort.

To my surprise, I was unable to find a script. Is scripting this so easy that every DBA can do it with their eyes closed, and nobody feels a need to share it? Is it so hard that they all decided they’re sitting on gold and will only share it for big money? Or were my internet search skills simply severely lacking?

Anyway, bottom line is that I had to do it myself. And I’ll share the result in this post.

Click through for the script and a detailed explanation of how it works.

Comments closed

Automation Tips for ETL

Richard Swinbank shares some tips around automating ETL processes:

I spent a good part of my career in the UK National Health Service (NHS). We often talk about data engineering problems in terms of handling large volumes of data, but the challenge in the NHS was frequently complexity – even a small acute hospital might have a dozen or more separate systems, running on various different DBMS backends, with many tables of interest for management or central reporting.

You might need to extract data from hundreds of (mostly small) tables to populate your data warehouse – my challenge was to make this as quick and easy to implement as possible.

Click through for the tips.

Comments closed

ACID Transactions with Hive LLAP in ElasticMapReduce

Suthan Phillips and Chao Gao walk us through ACID transactions when using Hive on Amazon’s ElasticMapReduce platform:

ACID (atomicity, consistency, isolation, and durability) properties make sure that the transactions in a database are atomic, consistent, isolated, and reliable.

Amazon EMR 6.1.0 adds support for Hive ACID transactions so it complies with the ACID properties of a database. With this feature, you can run INSERT, UPDATE, DELETE, and MERGE operations in Hive managed tables with data in Amazon Simple Storage Service (Amazon S3). This is a key feature for use cases like streaming ingestion, data restatement, bulk updates using MERGE, and slowly changing dimensions.

This post demonstrates how to enable Hive ACID transactions in Amazon EMR, how to create a Hive transactional table, how it can achieve atomic and isolated operations, and the concepts, best practices, and limitations of using Hive ACID in Amazon EMR.

Click through for a demonstration.

Comments closed

Cloudera’s Not Dead Yet

Alex Woodie shares some good news about Cloudera:

Things are starting to look up for Cloudera, which beat analyst expectations with its second quarter results announced yesterday. The distributed computing platform maker also gave investors something to cheer about with an optimistic financial forecast for the rest of fiscal 2021.

Cloudera, which trades on the New York Stock Exchange under the symbol CLDR, reported a non-GAAP profit of $0.10 per share for the second quarter of fiscal year 2021 ended July 31, exceeding analyst expectations by three cents. A year ago, it reported a non-GAAP loss of $.02 per share a year ago.

The company isn’t in great shape, but this is a good sign.

Comments closed

Automate sp_whoisactive Runs

John McCormack shows how to save the output from sp_whoisactive for later research:

When I took over a fairly troublesome SQL Server, I was always being asked what was running when the SQL Server was slow. That was fine when it was during the day and I could look. But often I would be asked, why was the server slow at 9pm last Friday? Or some other time. It’s not always the easiest thing to do as SQL Server doesn’t natively keep this information for you.

A quick but effective way to automate this is to download and schedule Adam Machanic’s sp_whoisactive stored procedure. I run this every 10 minutes but I have seen it scheduled as often as every 30-60 seconds. It depends on your system and your needs.

This is in place where I work, and I’ve put it into place when consulting for companies without the budget for fancy tools. I really appreciate that Adam Machanic made it a simple option to insert results into a table. Also, John has a step to delete older data, which is critical.

Comments closed

Automating dbachecks Runs

Jess Pomfret automates runs of dbachecks via scheduled task:

To automate the running of our daily checks we’ll first create a PowerShell script and then schedule that using task scheduler.  If you have other enterprise scheduling tools available you could easily use those instead to invoke the PowerShell script.

The script for my example, shown below, is pretty simple. I have a section to define where the data will be stored (the ability to save dbachecks result information straight into a database was introduced with dbachecks 2.0 and so I would highly recommend updating if you’re on an earlier version).

Click through for the full explanation and code.

Comments closed