SELECT INTO

Kevin Feasel

2016-10-21

T-SQL

Daniel Janik is not a fan of SELECT INTO:

This query for AdventureWorks will dump all of its results into a table named #MyDuplicateCities. Note that there is no CREATE TABLE statement. The INTO [tablename] will create the table for you.

Running this query a second time will result in failure if you haven’t dropped the #MyDuplicateCities table.

Using this syntax can be really helpful if you just need to do some quick and dirty cleanup; however, it should be avoided for stored procedures. Here’s why…

There are some trade-offs here and good arguments either way.  The comments tend to take the pro approach, so they’re worth reading as well.

Database Snapshot Creation History

Paul Randal shows how to read the master transaction log to find when database snapshots were created:

Earlier today someone asked on the #sqlhelp Twitter alias if there is a history of database snapshot creation anywhere, apart from scouring the error logs.

There isn’t, unfortunately, but you can dig around the transaction log of the master database to find some information.

When a database snapshot is created, a bunch of entries are made in the system tables in master and they are all logged, under a transaction named DBMgr::CreateSnapshotDatabase. So that’s where we can begin looking.

Click through for the script and some explanation around it.

Max Data Types In Queries

Erik Darling shows how variable definition can be important, even without implicit conversion:

SQL Server makes many good and successful attempts at something called predicate pushdown, or predicate pushing. This is where certain filter conditions are applied directly to the data access operation. It can sometimes prevent reading all the rows in a table, depending on index structure and if you’re searching on an equality vs. a range, or something else.

What it’s really good for is limiting data movement. When rows are filtered at access time, you avoid needing to pass them all to a separate operator in order to reduce them to the rows you’re actually interested in. Fun for you! Be extra cautious of filter operators happening really late in your execution plans.

Click through for Erik’s demo.

Flushing Change Tracking Internal Tables

Amit Banerjee mentions a new stored procedure for change tracking cleanup:

In SQL Server 2014 Service Pack 2 and above, we provided a new Stored Procedure, sp_flush_CT_internal_table_on_demand, to assist with Change Tracking cleanup. KB3173157 has more details. This stored procedure accepts a table name as parameter and will attempt to cleanup records from the corresponding change tracking internal table.  During the course of the deletion, it will print some verbose in the output window about the progress of deletion.

If automated change tracking cleanup works well enough for you, there’s no change; but if you’re struggling with that cleanup, this procedure might help.

What Is DevOps?

Julia Evans looks at what DevOps means in practice:

I enjoyed reading this article about devops at Etsy. One of the really key things about this article is – there is no devops organization at Etsy. It’s about how developers and operations people work productively together! Also, it was a slow incremental migration towards different practices. They did not wake up one day and become devops. I think this is the first talk that used the term ‘devops’?

It’s also not about “everyone is a software developer” – one of the authors of this book, Katherine Daniels, is a senior operations engineer at Etsy at Etsy. I don’t know any of the details of her job, but my impression is that she has a lot of expertise in operations. It’s not like “make operations so easy that nobody has to an expert at it”. Of course you need people who know a ton about operations! Probably those people write software as part of their job?

One of the scariest realizations that I’m slowly coming to (other than “Information Technology is people!”) is the sheer number of overlapping dependencies in the tech world.  A bit earlier in my career, I felt like I could be “a SQL Server guy” and focus on that while not caring too much about the outside world.  It seems like saying that you want to be “just an X” has become more difficult at the margin, and DevOps is just one example of this:  keeping an edge means going broader about more things while still trying to dig deeper in relevant areas.  That’s a tough balancing act.

RTVS 0.5

Kevin Feasel

2016-10-20

R

David Smith notes that R Tools for Visual Studio has hit version 0.5:

RTVS also makes it easy to run R code as a SQL Server 2016 stored procedure. (This is a great way to make share the results of R code with other database users while making use of the power of the database for R computations.) The new SQL R Stored Procedure file works with SQL Server R Services to create a stored procedure that embeds R code you create, edit and test within RTVS. This greatly simplifies the process of running R code within SQL Server 2016 as you can see below:

The RTVS team is making good progress.  If you passed on RTVS early on, it might be time to take another look.

Phantom Reads

Wayne Sheffield discusses phantom reads:

Run both code scripts again (Code Script 1 first, and Code Script 2 within 10 seconds). This time, you will see that Code Script 2 completes immediately without being blocked, and when Code Script 1 finishes, it has spawned additional data in its result set. A phantom read in action.

If you want to hide from phantom reads completely, then you’ll need to use either the serializable or snapshot transaction isolation levels. Both of these have the same concurrency effects: No dirty reads, non-repeatable reads, or phantom reads. The difference is in how they are implemented: the serializable transaction isolation level will block all other transactions affecting this data, while the snapshot isolation level utilizes row versions to create connection-specific versions of the table for the transaction – all of these row versions will cause increased activity in the tempdb database. Let’s take a look at how the snapshot isolation level will eradicate the phantom reads. First off, we need to modify the database to accept this isolation level.

After reading Wayne’s post, if you want a more academic (i.e., less fun) read, you can also go back to the Microsoft Research isolation levels paper, which describes most of the isolation levels we have in place today for SQL Server.

Temporary Stored Procedures

Jana Sattainathan discusses temporary stored procedures:

The real benefit of these procedures is when they contain lot of logic that you need on a temporary basis but do not want to clutter the existing stored procedure list. You could even have multiple temporary procedures that call each other. I would not go overboard with this. It is just a convenience.

I don’t often see these in use; when I’ve seen them, they’re in environments in which normal stored procedure create rights are locked down and you want to do something as a one-off (like testing an operation against production data).  In other words, those sketchy things that we don’t admit to each other that we do…

BCP For UTF-8

Sanjay Mishra notes that SQL Server 2016 and 2014 SP2 support UTF-8 for BCP and BULK INSERT:

This requirement has been addressed in SQL Server 2016 (and backported to SQL Server 2014 SP2). To test this, I obtained a UTF-8 dataset from http://www.columbia.edu/~fdc/utf8/. The dataset is translation of the sentence “I can eat glass and it doesn’t hurt me” in several languages. A few lines of sample data are shown here:

(As an aside, it is entirely possible to load Unicode text such as above into SQL Server even without this improvement, as long as the source text file uses a Unicode encoding other than UTF-8.)

I ran into this problem before, where developers wanted to bulk load UTF-8 but had to settle for an inferior solution.

Page Free Space On Heaps

Raul Gonzalez shows a case in which writing to a heap can be more efficient than writing to a clustered index:

Lots have been written, including myself, about the advantage and disadvantages of the different types of tables that exist in SQL Server, Clustered and Heaps.

Today I will again, because at the end of the day, a table is the most basic structure in a database and we need them to store the data, which is indeed the purpose of having a database, right?

Interesting reading.  Check it out.

Categories

November 2018
MTWTFSS
« Oct  
 1234
567891011
12131415161718
19202122232425
2627282930