
Category: Architecture

Single-Socket OLTP Systems?

Joe Chang tosses a hardware-related bomb:

Today, it is time to consider the astonishing next step: that a single socket system is the best choice for a transaction processing system. First, with proper database architecture and tuning, 12 or so physical cores should be more than sufficient for a very large majority of requirements.

We should also factor in that the second generation hyper-threading (two logical processors per physical core) from Nehalem on has almost linear scaling in transactional workloads (heavy on index seeks involving few rows). This is very different from the first generation HT in Willamette and Northwood which was problematic, and the improved first generation in Prescott which was somewhat better in the positive aspects, and had fewer negatives.

Joe knows a lot more about this than I do, but I’m very hesitant for two reasons.  First, scale.  When we start looking at hundreds of concurrent requests, would a single-socket machine really work?  I don’t know the answer to that, but by my simplistic “more is better than fewer” rule of thumb, I’d err on the side of caution, especially if it isn’t my money paying for this.

Second, there are batch processes and large background activities which occur even on extremely transactional OLTP systems.  Think about running CHECKDB or ETL processing or troubleshooting/monitoring processes.  These are going to be processes which benefit from parallelism, and if you’re seriously limiting core counts (which a single socket would necessarily do), you might end up in a bad way when they run even if your “normal” workload performs a little better.
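If you are weighing this trade-off, it helps to know exactly how many schedulers and logical processors SQL Server sees before capping yourself at one socket. Below is a minimal sketch against the standard DMVs; the socket arithmetic on hyperthread_ratio is my own rough estimate, not anything from Joe's post.

```sql
-- Rough sketch: how much CPU does SQL Server actually see, and how much
-- headroom is there for parallel batch work like CHECKDB or ETL?
SELECT
    cpu_count,                                      -- logical processors visible to SQL Server
    hyperthread_ratio,                              -- logical processors per physical processor package
    cpu_count / hyperthread_ratio AS socket_estimate,
    scheduler_count,
    physical_memory_kb / 1048576 AS physical_memory_gb
FROM sys.dm_os_sys_info;

-- Schedulers that are online and visible, i.e. available to parallel query plans.
SELECT COUNT(*) AS online_visible_schedulers
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';
```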


Object Naming

Andy Galbraith warns you against…odd…database names:

I went and looked on the server, and sure enough in Management Studio I saw one database named “FinanceDB” and a database named “[FinanceDB]”.

This was on a SQL 2008R2 instance, but as a test I created a database named [test] on my local SQL 2014 instance and sure enough it worked!

The source of the problem at the client was the LiteSpeed maintenance plan.  Even though the backup task was set to back up all user databases, it wasn’t picking up the square-bracketed database.

I’d go a bit further and say that you should avoid everything except alphanumeric characters and maybe underscores in the names of databases, tables, views, and all other database objects.
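If you want to see the trap for yourself, here is a quick repro sketch (the database names are only for the experiment); note how a closing bracket has to be doubled inside a bracketed identifier:

```sql
-- Creates a database named test: the brackets here are only quoting the identifier.
CREATE DATABASE [test];

-- Creates a database literally named [test]: inside a bracketed identifier,
-- a closing bracket must be doubled, so [[test]]] yields the name "[test]".
CREATE DATABASE [[test]]];

-- Both appear side by side, and any script or tool that builds its own bracketing
-- (like the LiteSpeed maintenance plan above) can easily confuse the two.
SELECT name, QUOTENAME(name) AS quoted_name
FROM sys.databases
WHERE name LIKE N'%test%';

-- Clean up the experiment.
DROP DATABASE [test];
DROP DATABASE [[test]]];
```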


Service Broker Architecture

Colleen Morrow gives explanations for various Service Broker components:

Before I jump into the technical details of the Service Broker architecture, I think it helps to have a real-world analogy of what Service Broker is and does.  In the last installment, I used the example of ordering something from Amazon.com.  This time, I’d like to use an analogy that’s somewhat timely: taxes.

Each year, we fill out that 1040 or 1040EZ form and we send it to the Internal Revenue Service.  Maybe we eFile, maybe we mail it in, it doesn’t matter.  That form is received by the IRS and goes into a queue, awaiting review.  At some point, days, maybe weeks later, our tax return is processed.  If all goes well, our return is approved and the IRS cuts us a check.  That is a Service Broker application.

When I first started learning Service Broker, it seemed like there were a lot of abstract notions (mostly because I didn’t know anything about message queues).  The pieces all start to come together once you get into an application.
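To make the moving parts concrete, here is a minimal Service Broker sketch following Colleen's tax analogy. All of the object names are made up for illustration, and the database is assumed to already have Service Broker enabled.

```sql
-- Message types describe what can be sent; validation is optional but useful.
CREATE MESSAGE TYPE [//Tax/Return1040]  VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE [//Tax/RefundCheck] VALIDATION = WELL_FORMED_XML;

-- The contract spells out who may send which message types on a conversation.
CREATE CONTRACT [//Tax/FileReturn]
(
    [//Tax/Return1040]  SENT BY INITIATOR,
    [//Tax/RefundCheck] SENT BY TARGET
);

-- Queues hold messages until a reader processes them (the IRS "inbox").
CREATE QUEUE dbo.TaxpayerQueue;
CREATE QUEUE dbo.IrsQueue;

-- Services are the addressable endpoints sitting on top of the queues.
CREATE SERVICE [//Tax/TaxpayerService] ON QUEUE dbo.TaxpayerQueue ([//Tax/FileReturn]);
CREATE SERVICE [//Tax/IrsService]      ON QUEUE dbo.IrsQueue      ([//Tax/FileReturn]);

-- "Mailing in" a return: open a conversation and send the message.
DECLARE @handle UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE [//Tax/TaxpayerService]
    TO SERVICE   N'//Tax/IrsService'
    ON CONTRACT  [//Tax/FileReturn]
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @handle
    MESSAGE TYPE [//Tax/Return1040]
    (N'<Return form="1040EZ" year="2015" />');
```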


NUMA With Few Cores

Denny Cherry asks and answers the question of how many NUMA nodes we should use on a server with a large amount of RAM but relatively few cores:

For this example, let’s assume that we have a physical server with 512 Gigs of RAM and two physical NUMA nodes (and two CPU sockets). We have a VM running in that machine which has a low CPU requirement, but a large working set. Because of this we have 4 cores and 360 Gigs of RAM presented to the VM.

The answer is not trivial, making this an interesting question.
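One way to ground the question is to look at what SQL Server has actually done with the hardware it was given. This is a small sketch against the NUMA-related DMVs, not anything specific to Denny's scenario:

```sql
-- How has SQL Server carved the machine (or VM) into NUMA nodes?
SELECT
    n.node_id,
    n.node_state_desc,
    n.online_scheduler_count,            -- schedulers (roughly, cores) on this node
    n.memory_node_id
FROM sys.dm_os_nodes AS n
WHERE n.node_state_desc <> N'ONLINE DAC';   -- ignore the dedicated admin connection node

-- How much memory is committed per memory node?
SELECT
    memory_node_id,
    virtual_address_space_committed_kb / 1024 AS committed_mb
FROM sys.dm_os_memory_nodes;
```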


Text Search

Anders Pedersen discusses one method he used to implement fast text search in SQL Server:

Looking into what was needed, I quickly realized there was a LOT of data; I guess 50+ years of news broadcasts will do that.  Considering this was in the early 2000s, some innovation was needed beyond anything I had coded before.  Obviously LIKE searches were out of the question, and full text search was not available.  So what to do?

Basically I decided to break each broadcast down into individual words stored in a separate table; the entire application fit in two tables: Story and Words.

This is a case in which thinking about the grain of data helps solve an otherwise-intractable problem.
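As a rough sketch of what that grain looks like, here are the two tables plus a multi-word search query. The post only names the tables Story and Words, so the columns are my guesses:

```sql
CREATE TABLE dbo.Story
(
    StoryID       INT IDENTITY(1,1) PRIMARY KEY,
    BroadcastDate DATE          NOT NULL,
    Title         NVARCHAR(200) NOT NULL,
    Body          NVARCHAR(MAX) NOT NULL
);

CREATE TABLE dbo.Words
(
    Word    NVARCHAR(50) NOT NULL,
    StoryID INT          NOT NULL REFERENCES dbo.Story (StoryID),
    CONSTRAINT PK_Words PRIMARY KEY CLUSTERED (Word, StoryID)
);

-- Find stories containing ALL of the search terms: one index seek per word,
-- then intersect by story.
SELECT s.StoryID, s.Title, s.BroadcastDate
FROM dbo.Words AS w
    INNER JOIN dbo.Story AS s
        ON s.StoryID = w.StoryID
WHERE w.Word IN (N'election', N'budget')
GROUP BY s.StoryID, s.Title, s.BroadcastDate
HAVING COUNT(DISTINCT w.Word) = 2;   -- must match every term in the list
```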


SQL Server Event Logging

Kendra Little discusses having a reusable event logging tool for your database work:

You can’t, and shouldn’t, log everything, because logging events can slow you down. And you shouldn’t always log to a database, either; you can keep logs in the application tier as well, no argument here.

But most applications periodically do ‘heavy’ or batch database work. And when those things happen, it can make a lot of sense to log to the database. That’s where this logging comes in.

Bonus points if you feed this kind of logging into Splunk (or your logging analysis tool of choice) and integrate it with application-level logging.
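A minimal version of this kind of logging might look like the sketch below. The table, procedure, and column names are illustrative, not Kendra's actual implementation:

```sql
CREATE TABLE dbo.EventLog
(
    EventLogID   BIGINT IDENTITY(1,1) PRIMARY KEY,
    LoggedAt     DATETIME2(3)  NOT NULL CONSTRAINT DF_EventLog_LoggedAt DEFAULT SYSUTCDATETIME(),
    ProcessName  NVARCHAR(128) NOT NULL,
    StepName     NVARCHAR(128) NOT NULL,
    RowsAffected BIGINT        NULL,
    DurationMs   INT           NULL,
    Notes        NVARCHAR(MAX) NULL
);
GO

CREATE PROCEDURE dbo.LogEvent
    @ProcessName  NVARCHAR(128),
    @StepName     NVARCHAR(128),
    @RowsAffected BIGINT        = NULL,
    @DurationMs   INT           = NULL,
    @Notes        NVARCHAR(MAX) = NULL
AS
BEGIN
    SET NOCOUNT ON;
    INSERT dbo.EventLog (ProcessName, StepName, RowsAffected, DurationMs, Notes)
    VALUES (@ProcessName, @StepName, @RowsAffected, @DurationMs, @Notes);
END;
GO

-- Typical call from inside a batch job:
EXEC dbo.LogEvent
    @ProcessName  = N'NightlyETL',
    @StepName     = N'LoadFactSales',
    @RowsAffected = 1048576,
    @DurationMs   = 92000;
```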


Junk Dimensions

Jesse Seymour talks about junk dimensions in warehousing:

I think one of the single biggest challenges I face as I attempt to warehouse data that originates as a SharePoint list is the handful of miscellaneous descriptive fields, such as approval status, request status, or something similar.  Typically, these fields are set up as Choice fields in the SharePoint list so they have a known range of values, but it’s still a pain to have to build a dimension for each one.

Enter the junk dimension.  Ever since I learned about this concept, it has made my life so much easier.  What the junk dimension does is perform a cross join against the different fields, creating a row for every possible combination of their values.

Junk dimensions are nice for those low-cardinality attributes which are important but don’t really fit anywhere else.  The important thing to remember about a junk dimension is that you don’t want it to be too large:  if you have 5 attributes, each of which has 8 possible values, you have 8^5 (32,768) rows.  That’s not so bad, but make it 10 attributes and now your table has 1,073,741,824 rows, and that’s a lot of rows for a single dimension.  If you find yourself in that scenario, you might want to create two junk dimensions (bringing you back to 2 dimensions with 32K rows), review your design to see if all those attributes are necessary, or review your design to see if your “junk” dimension is hiding a real dimension.
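As a small illustration of the cross-join construction (the attribute names and values here are hypothetical):

```sql
CREATE TABLE dbo.DimRequestJunk
(
    RequestJunkKey INT IDENTITY(1,1) PRIMARY KEY,
    ApprovalStatus NVARCHAR(20) NOT NULL,
    RequestStatus  NVARCHAR(20) NOT NULL,
    Priority       NVARCHAR(10) NOT NULL
);

-- Cross join the known value lists so every combination gets exactly one surrogate key.
INSERT dbo.DimRequestJunk (ApprovalStatus, RequestStatus, Priority)
SELECT a.Value, r.Value, p.Value
FROM       (VALUES (N'Pending'), (N'Approved'), (N'Rejected'))   AS a (Value)
CROSS JOIN (VALUES (N'New'), (N'In Progress'), (N'Closed'))      AS r (Value)
CROSS JOIN (VALUES (N'Low'), (N'Medium'), (N'High'))             AS p (Value);
-- 3 x 3 x 3 = 27 rows in total.
```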


Logical Data Models

Kevin Kline discusses logical data modeling:

In a recent blog post entitled Is Logical Data Modeling Dead?, Karen Lopez (b | t) comments on the trends in the data modeling discipline and shares her own processes and preferences for logical data modeling (LDM). Her key point is that LDMs are on the decline primarily because they (and their creators) have failed to adapt to changing development processes and trends.

I love all things data modeling. I found data models to be a soothing and reassuring roadmap that underpinned the requirements analysis and spec writing of the Dev team, as well as a supremely informative artifact of the Dev process which I would constantly refer to when writing new T-SQL code and performing maintenance. However, as time has passed, I have been surprised by how far it has fallen out of favor.

This is an interesting discussion.  I’m not sure I’ve ever created a true logical data model.  I’ve worked with systems which could potentially take advantage of them, but they never hit the top of the priority list.


Choosing Between Optimistic Concurrency Levels

Kendra Little has a cheat sheet for comparing the two optimistic concurrency levels:

SQL Server offers two flavors of optimistic locking for traditional disk-based tables: Read Committed Snapshot Isolation (RCSI), and Snapshot Isolation. They are each great tools to reduce blocking and make applications faster, particularly in databases that have lots of tiny reads and writes that need to be quick, or mixed-use patterns (lots of little reads and writes + larger reporting queries).

Both isolation levels are available in SQL Server Standard Edition and Web Edition as well as Enterprise. That’s right, they’re cheap and easy. Each of them are controlled by database level settings, and default to disabled when you install SQL Server.

The moral of the story:  both of these are awesome, both have potential drawbacks, and both need testing.  I’ve had good experiences with RCSI, but even then, maybe about 1% of procedures need specific locking hints (either NOLOCK or an explicit lock) to maintain previous application behavior and to deal with the problem Kendra brought up.  So: test, test, test.
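For reference, both settings are single database-level switches. This is a hedged sketch with a placeholder database name; try it on a test copy first, since RCSI changes the default read behavior for every session:

```sql
-- Snapshot Isolation: opt-in per session via SET TRANSACTION ISOLATION LEVEL SNAPSHOT.
ALTER DATABASE [YourDatabase] SET ALLOW_SNAPSHOT_ISOLATION ON;

-- RCSI: changes the behavior of plain READ COMMITTED for everyone.
ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;   -- needs exclusive access; rolls back open transactions

-- Confirm the current settings.
SELECT name, snapshot_isolation_state_desc, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'YourDatabase';
```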


DR On The Cheap

Derik Hammer’s final Availability Group architecture post covers disaster recovery on the cheap:

This architecture can be used when your organization does not value its secondary data center the same as the primary. It is a best practice to have matched or similar hardware between your primary and disaster recovery sites, but that is not always possible. When costs need to be reduced, it is better to have one failover server that you know can handle the workload than two servers which are underpowered. Underpowered hardware can easily become an effective outage if it causes timeouts as soon as a production workload is placed on it.

For many small to medium-sized enterprises, this might be the easiest route to sell to management:  it’s hard to convince the business to pay for “redundant” servers which normally don’t get used.
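One possible replica layout for this pattern is sketched below, with placeholder server and database names: synchronous-commit replicas in the primary site and a single asynchronous-commit replica in the cheaper DR site.

```sql
CREATE AVAILABILITY GROUP [AG_Sales]
FOR DATABASE [SalesDB]
REPLICA ON
    N'PRIMARY01' WITH (
        ENDPOINT_URL      = N'TCP://primary01.corp.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC),
    N'PRIMARY02' WITH (
        ENDPOINT_URL      = N'TCP://primary02.corp.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC),
    N'DR01' WITH (
        ENDPOINT_URL      = N'TCP://dr01.corp.local:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  -- DR site: no sync-commit latency penalty
        FAILOVER_MODE     = MANUAL);              -- failing over to DR is a deliberate decision
```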
