Cross-Container Transactions With Memory-Optimized Objects

Ned Otter continues his series on In-Memory OLTP isolation levels:

Why will it fail?

It will fail because the initiation mode of this transaction is not autocommit, which is required for READ COMMITTED SNAPSHOT when referencing memory-optimized tables (the initiation mode is explicit, because we explicitly defined a transaction). So to be totally clear, for queries that only reference memory-optimized tables, we can use the READ COMMITTED or READ COMMITTED SNAPSHOT isolation levels, but the transaction initiation mode must be autocommit. Keep this in mind, because in a moment, you’ll be questioning that statement…

There are some interesting implications that Ned teases out, so I recommend giving it a careful read.
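To make the autocommit requirement concrete, here is a minimal sketch in the same vein, assuming a hypothetical memory-optimized table named dbo.InMemTable and the database option MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT left OFF: the standalone SELECT runs fine under READ COMMITTED, while the same statement inside an explicit transaction fails unless the table is hinted with a supported isolation level.

    -- dbo.InMemTable is a hypothetical memory-optimized table.
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

    -- Autocommit: a standalone statement succeeds.
    SELECT COUNT(*) FROM dbo.InMemTable;

    -- Explicit transaction: the unhinted version of the same statement would
    -- fail, because READ COMMITTED against a memory-optimized table is only
    -- supported for autocommit transactions.
    BEGIN TRANSACTION;
        -- SELECT COUNT(*) FROM dbo.InMemTable;                 -- would fail
        SELECT COUNT(*) FROM dbo.InMemTable WITH (SNAPSHOT);    -- succeeds
    COMMIT TRANSACTION;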

Memory-Optimized Object Isolation Levels

Ned Otter looks at the isolation levels offered when you work with memory-optimized objects:

If you are only querying on-disk tables, you can use any of the isolation levels from List 1. And if you are only querying memory-optimized tables, you can use any of the isolation levels from List 2.

But what if you want to reference both on-disk and memory-optimized tables in the same query? Of course, the answer is “it depends”, with transaction initiation modes and isolation levels being the components of that dependency.

This post is part one of a series and is mostly around level-setting.
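As a preview of where the series is heading, a cross-container query (one that touches both an on-disk and a memory-optimized table) inside an explicit transaction generally needs a table hint on the memory-optimized side. A minimal sketch, assuming hypothetical tables dbo.DiskTable (on-disk) and dbo.InMemTable (memory-optimized) that share an Id column:

    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

    BEGIN TRANSACTION;
        SELECT d.Id
        FROM dbo.DiskTable AS d
        -- The hint supplies a supported isolation level for the in-memory side.
        INNER JOIN dbo.InMemTable AS m WITH (SNAPSHOT)
            ON m.Id = d.Id;
    COMMIT TRANSACTION;

Setting the database option MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON accomplishes the same thing without the per-table hint.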

Temporal Memory-Optimized Tables

Ned Otter describes how hybrid disk + memory-optimized temporal tables differ from on-disk temporal tables:

As changes are made to rows in the temporal memory-optimized table, before being transferred to the history table on disk, they are first migrated to an internal memory-optimized staging table. That means when you query the “history table”, you could be retrieving rows from both the on-disk history table, and internal staging table. Because no custom indexing was possible on the internal staging table, there could be performance implications when executing queries against historical data. Microsoft addressed these potential performance issues in SQL 2016 SP1 (detailed in this CAT blog post).

The internal staging table only gets flushed to the on-disk history table when it reaches 8% of the size of the temporal table. Given the current capacities of Windows Server 2016 (24TB memory), it’s now possible to have very large memory-optimized tables. 8% of one of those large memory-optimized tables could be quite large, which will affect query performance, if predicates don’t match available indexes.

Read on for some sobering thoughts on the topic.
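For reference, a system-versioned memory-optimized table is declared much like its on-disk counterpart, and the history table it names lives on disk. A minimal sketch with illustrative table and column names (the database needs a memory-optimized filegroup):

    CREATE TABLE dbo.ProductPrice
    (
        ProductId INT       NOT NULL PRIMARY KEY NONCLUSTERED,
        Price     MONEY     NOT NULL,
        ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
        ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
    )
    WITH
    (
        MEMORY_OPTIMIZED = ON,
        DURABILITY = SCHEMA_AND_DATA,
        SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductPriceHistory)  -- on-disk history
    );

The internal staging table Ned mentions sits between dbo.ProductPrice and dbo.ProductPriceHistory and is managed entirely by the engine.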

Schema-Only Optimized Tables Can Still Roll Back

Chris Adkin investigates whether schema-only memory-optimized tables are logged and whether they support transactions the way other tables do:

The statement “There is zero logging when DURABILITY=SCHEMA_ONLY” is not factually correct; it’s more like a minimally logged operation. What is surprising is that, as advertised, logging for the in-memory engine should result in far fewer log records than the equivalent workload for the legacy engine; clearly this is not the case in this particular example, and it is something I need to dig into somewhat deeper. Also note that the version of SQL Server being used is SQL Server 2016 SP1 CU3, which should be stable. One final point: in order to make sure that fn_dblog and fn_dblog_xtp produced clean results for me each time, I took the quick and dirty option of re-creating my test database each time.

This post definitely ranks in the “Microsoft did this right” category.
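If you want to run the same kind of test Chris describes, a minimal sketch looks like this (table and column names are illustrative; fn_dblog and fn_dblog_xtp are undocumented functions, so keep this to a test environment):

    -- Requires a database with a memory-optimized filegroup.
    CREATE TABLE dbo.StagingOnly
    (
        Id  INT          NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024),
        Val VARCHAR(100) NOT NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

    INSERT dbo.StagingOnly (Id, Val) VALUES (1, 'test');

    -- Compare what each engine wrote to the transaction log.
    SELECT COUNT(*) AS log_records     FROM sys.fn_dblog(NULL, NULL);
    SELECT COUNT(*) AS xtp_log_records FROM sys.fn_dblog_xtp(NULL, NULL);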

Partitioned Views With Memory-Optimized Tables

Ned Otter ran into an issue building partitioned views from a combination of disk-based and memory-optimized tables:

Let’s assume that we have two tables that belong to a partitioned view. Both tables can be memory-optimized, or one can be memory-optimized and the other on-disk.

Success condition:

An UPDATE occurs to a row in a table, and the UPDATE does not change where the row would reside, i.e. it does not cause the row to “move” to another table, based on defined CONSTRAINTS.

Failure conditions:

   a. An UPDATE occurs to a row in the memory-optimized table that causes it to move to either another memory-optimized table or an on-disk table

   b. An UPDATE occurs to a row in the on-disk table that causes it to move to the memory-optimized table

Read the whole thing.  The good news is that if you’re splitting in a way that doesn’t require updating from one type of table to the other, partitioned views work fine.
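A minimal sketch of the kind of setup Ned describes, with illustrative names and ranges: one memory-optimized member, one on-disk member, and CHECK constraints that partition on OrderId. The second UPDATE below is the sort that trips the failure conditions, because the new value would push the row across the memory-optimized/on-disk boundary.

    CREATE TABLE dbo.Orders_Hot
    (
        OrderId INT NOT NULL PRIMARY KEY NONCLUSTERED,
        CONSTRAINT ck_Orders_Hot CHECK (OrderId >= 1000)
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

    CREATE TABLE dbo.Orders_Cold
    (
        OrderId INT NOT NULL PRIMARY KEY,
        CONSTRAINT ck_Orders_Cold CHECK (OrderId < 1000)
    );
    GO

    CREATE VIEW dbo.Orders AS
        SELECT OrderId FROM dbo.Orders_Hot
        UNION ALL
        SELECT OrderId FROM dbo.Orders_Cold;
    GO

    -- Success condition: the row stays in the same member table.
    UPDATE dbo.Orders SET OrderId = 2000 WHERE OrderId = 1500;

    -- Failure condition: the row would have to move from the memory-optimized
    -- member to the on-disk member.
    UPDATE dbo.Orders SET OrderId = 500 WHERE OrderId = 2000;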

When You Need To Read Memory-Optimized Data From Disk

Ned Otter enumerates the scenarios in which SQL Server needs to read data from disk for memory-optimized tables:

Those who have studied In-Memory OLTP are aware that in the event of “database restart”, durable memory-optimized data must be streamed from disk to memory. But that’s not the only time data must be streamed, and the complete set of events that cause this is not intuitive. To be clear, if your database had to stream data back to memory, that means all your memory-optimized data was cleared from memory. The amount of time it takes to do this depends on:

  • the amount of data that must be streamed

  • the number of indexes that must be rebuilt

  • the number of containers in the memory-optimized database, and how many volumes they’re spread across

  • how many indexes must be recreated (SQL 2017 has a much faster index rebuild process, see below)

  • the number of LOB columns

  • whether the BUCKET count is properly configured for HASH indexes

Read on for the list of scenarios that might cause a standalone SQL Server instance to need to stream data from disk into memory to re-hydrate memory-optimized tables and indexes.
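If you want a rough sense of how much memory-optimized data a restart (or any of the other streaming events) would have to bring back, a sketch like this against sys.dm_db_xtp_table_memory_stats reports per-table memory usage in the current database:

    SELECT
        OBJECT_SCHEMA_NAME(ms.object_id) + N'.' + OBJECT_NAME(ms.object_id) AS table_name,
        ms.memory_used_by_table_kb   / 1024.0 AS table_mb,
        ms.memory_used_by_indexes_kb / 1024.0 AS index_mb
    FROM sys.dm_db_xtp_table_memory_stats AS ms
    WHERE ms.object_id > 0   -- exclude internal system allocations
    ORDER BY ms.memory_used_by_table_kb DESC;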

Migrating To In-Memory OLTP

Erin Stellato is kicking the hornet’s nest again; this time it’s about In-Memory OLTP:

In the past few months I’ve had several clients reach out about migrating to In-Memory OLTP solutions.  When the second or third request rolled in I remember thinking, “Finally!”  As in, I’ve been wondering why businesses haven’t been looking to implement In-Memory sooner.  In fact, I added a section on In-Memory to our IEPTO2 course because with the availability of the feature in SQL Server 2016 SP1 (yes, there are memory limitations, but it’s a good place to start) I figured we would see an uptick in interest.  But here we are, halfway through 2017 and over 6 months since the release of SQL Server 2016 SP1, and I still see a lot of hesitation around it.

I wrote a post over on SQLPerformance last week, Testing DML Statements for In-Memory OLTP, and that generated some discussions on Twitter.  So I figured it was time for a post to find out what’s holding companies back.  This isn’t a true poll – it’s a fill-in-the-blank.  As in: post a comment.  If you have considered migrating to In-Memory and then didn’t, I want to understand why.  I recognize there are limitations – the product is still new and it’s evolving.  But perhaps if we understand the largest inhibitors to migration we can help move them up on Microsoft’s list via Connect and other pathways.  Understand I am asking for specifics here, for example: we can’t use In-Memory tables because they don’t support spatial data types.  Whatever the reason, share it via a comment.

If I were to take a wild guess, the most common answers will be something like:

  1. Not using SQL Server 2014 EE or later, or any edition of 2016 SP1 or later
  2. Limitations in what memory-optimized tables provide:  can’t go cross-database, can’t create useful constraints, etc.
  3. Syntax troubles, particularly in 2014:  no outer joins, etc.
  4. Difficulties fitting this into a legacy system:  it’s not just as simple as drop-and-replace tables due to limitations above.  Also, due to size limits (none of that NVARCHAR(MAX) business), candidate tables might need to be broken up or restructured so that they fit the mold.

I like using memory-optimized tables where I can, but have had much more success with memory-optimized table-valued parameters.
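For anyone who has not tried them, a memory-optimized table-valued parameter is just a table type declared WITH (MEMORY_OPTIMIZED = ON). A minimal sketch, with hypothetical type, procedure, and table names:

    CREATE TYPE dbo.OrderIdList AS TABLE
    (
        OrderId INT NOT NULL,
        INDEX ix_OrderId HASH (OrderId) WITH (BUCKET_COUNT = 1024)
    )
    WITH (MEMORY_OPTIMIZED = ON);
    GO

    -- dbo.Orders is a hypothetical target table.
    CREATE PROCEDURE dbo.GetOrders
        @Ids dbo.OrderIdList READONLY
    AS
    BEGIN
        SELECT o.OrderId, o.OrderDate
        FROM dbo.Orders AS o
        INNER JOIN @Ids AS i ON i.OrderId = o.OrderId;
    END;
    GO

    DECLARE @Ids dbo.OrderIdList;
    INSERT @Ids (OrderId) VALUES (1), (2), (3);
    EXEC dbo.GetOrders @Ids = @Ids;

The table variable lives in memory rather than tempdb, which is a large part of its appeal.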

Comparing Memory-Optimized Versus On-Disk Performance

Erin Stellato has a performance comparison between disk-based and memory-optimized tables:

I developed the following test cases:

  1. A disk-based table with traditional stored procedures for DML.
  2. An In-Memory table with traditional stored procedures for DML.
  3. An In-Memory table with natively compiled procedures for DML.

I was interested in comparing performance of traditional stored procedures and natively compiled procedures, because one restriction of a natively compiled procedure is that any tables referenced must be In-Memory. While single-row, solitary modifications may be common in some systems, I often see modifications occurring within a larger stored procedure with multiple statements (SELECT and DML) accessing one or more tables. The In-Memory OLTP documentation strongly recommends using natively compiled procedures to get the most benefit in terms of performance. I wanted to understand how much it improved performance.

Read on for the results.
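For context on what the third test case looks like, here is a minimal sketch of a natively compiled procedure doing single-row DML, assuming a memory-optimized table dbo.InMemOrders with matching columns (names are illustrative):

    CREATE PROCEDURE dbo.InMemOrders_Insert
        @OrderId   INT,
        @OrderDate DATETIME2
    WITH NATIVE_COMPILATION, SCHEMABINDING
    AS
    BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
        INSERT dbo.InMemOrders (OrderId, OrderDate)
        VALUES (@OrderId, @OrderDate);
    END;

Natively compiled procedures can reference only memory-optimized tables, which is the restriction Erin calls out above.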

Finding Candidates For Memory-Optimized Tables

Ned Otter points out a very interesting report in SSMS 2016 and 2017, which helps you determine if you should migrate a table to be memory-optimized:

The chart attempts to display both the best candidates and the degree of difficulty for migration. But there are a few problems with the “difficulty” rating of this internal query, and as a result, if we really want to know how easy/difficult the process might be, we’ll have to roll our own query.

Read on for more details, as well as a script Ned has put together to do the same in T-SQL.
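If you would rather not rely on the report at all, a rough starting point for rolling your own (this is not Ned’s script, nor the report’s internal query; it is just an illustrative screen for obvious blockers such as LOB and computed columns):

    SELECT
        OBJECT_SCHEMA_NAME(t.object_id) + N'.' + t.name AS table_name,
        SUM(CASE WHEN c.max_length = -1 THEN 1 ELSE 0 END) AS lob_columns,
        SUM(CASE WHEN c.is_computed = 1 THEN 1 ELSE 0 END) AS computed_columns
    FROM sys.tables AS t
    INNER JOIN sys.columns AS c ON c.object_id = t.object_id
    WHERE t.is_memory_optimized = 0
    GROUP BY t.object_id, t.name
    ORDER BY table_name;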

Reducing TempDB Usage With Memory-Optimized Objects

Mark Wilkinson shows how to replace temp tables (or table variables) with memory-optimized table variables to reduce tempdb latching:

If all worked, you should now see that we have contention on the sysschobjs table. Earlier we discussed using sp_help to get index details on system tables; if we do that now and look at index 2, we will see the lead column is nsclass, which is a tinyint field. Using a tinyint as a lead column is typically a terrible idea, since there is little selectivity on such a narrow field, and this is no exception.

This isn’t the only case of contention you might see with system objects related to temporary tables. We ran into a few different contention scenarios with tempdb:

  • Contention on sysschobjs again, but on index 3. This index leads with the name of the temporary table and is fairly narrow so you can fit a lot of records on a single index page. Because of this, if you are running lots of concurrent procedures that create temporary tables with the same or similar names, it creates a hot spot on a single page, leading to more contention.

  • Temporary table auto-stats. Statistics objects for all tables (including temporary tables) are stored in the sys.sysobjvalues table. If you get enough auto-stats generations on temporary tables you can see contention here.

Mark’s post reads like a book chapter and he does a great job of summing up the problem and the solution.
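If you suspect the same problem in your environment, a quick way to catch it in the act is to watch for page latch waits on tempdb pages while the workload runs. A minimal sketch (tempdb is database ID 2; resource_description is formatted as dbid:fileid:pageid):

    SELECT
        wt.session_id,
        wt.wait_type,
        wt.wait_duration_ms,
        wt.resource_description
    FROM sys.dm_os_waiting_tasks AS wt
    WHERE wt.wait_type LIKE N'PAGELATCH%'
      AND wt.resource_description LIKE N'2:%';

From there, DBCC PAGE (or sys.dm_db_page_info on newer versions) can tell you which system object owns the hot page.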
