Performance Tuning A Streaming Application

Mathieu Dumoulin explains how he was able to get 10x performance out of a streaming application built around Kafka, Spark Streaming, and Apache Ignite:

The main issues for these applications were caused by trying to run a development system’s code, tested on AWS instances, on a physical, on-premise cluster running on real data. The original developer was never given access to the production cluster or the real data.

Apache Ignite was a huge source of problems, principally because it is such a new project that nobody had any real experience with it and also because it is not a very mature project yet.

I found this article fascinating, particularly because the answer was a lot more than just “throw some more hardware at the problem.”

Understanding SSIS Catalog References

Andy Leonard has a detailed post on setting up SSIS catalog environments and references:

Why This Complexity?

My argument for embracing (or ignoring) the complexity of SSIS Catalog configuration is straightforward: Externalization – the act of storing execution-time values outside of the object to be executed – is one way to achieve portability, a software development best practice.

As an architect I loathe complexity. Whenever I encounter (or opt to develop) complexity in a software project, I question it. I ask others to question it. I don’t always arrive at the right conclusion, but I challenge complexity at every turn.

I’ve helped several organizations implement SSIS Catalog portability using Environments and References. We all agree that the solution is complex, but most also agree that the complexity is worth the flexibility achieved thereby.

This is a fairly lengthy blog post, full of good information.

Getting The Decimal Part Of A Number

Kevin Feasel

2017-01-24

T-SQL

Madhivanan JR shows two methods for extracting the decimal value of a number:

One of the questions asked in a forum: “What are the different methods to extract the decimal part from a number?”

There can be many methods. Two simple methods are as shown below.

Click through for the two methods.  I knew method #1 but didn’t think about method #2 because I’ve always thought of the modulo with respect to integers, but it makes sense.
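The linked post's T-SQL is not reproduced here, so the following is only a rough Python sketch of two approaches that fit the description: subtracting the integer part (my assumption for method #1) and taking the value modulo 1 (the modulo approach mentioned above). The function names are hypothetical, and `Decimal` is used to avoid binary floating-point noise.

```python
from decimal import Decimal


def fractional_by_subtraction(value):
    # int() truncates toward zero, so the sign of the input is preserved.
    return value - Decimal(int(value))


def fractional_by_modulo(value):
    # Decimal's % operator keeps the sign of the dividend.
    return value % 1


amount = Decimal("12.345")
print(fractional_by_subtraction(amount))
print(fractional_by_modulo(amount))
```

Both functions agree for negative inputs as well, because truncation and `Decimal`'s remainder both keep the sign of the original value.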

Installing DBATools Without Powershell 5

Steve Jones has to install dbatools without the benefit of Powershell 5.0:

I have a number of Windows Server 2012 R2 VMs. I need to get rid of these and upgrade to WS2016, since installing SQL 2016 is a pain on these VMs. However, for the time being, I’m stuck with them.

I was looking to try some of the dbatools modules. I popped open an elevated command prompt and [error]

This is because that module isn’t in this version of PoSh. I checked, and I could see this is v4, and Install-Module was added in v5.

Click through for the solution.

Excluding Schema Comparison Sections

Angela Henry shows how to exclude entire sections of a schema comparison in Visual Studio:

When using VS for database projects I typically use my environment specific Publish xml file to deploy changes to my local database when experimenting with code changes.  However, every once in a while I will have to use the New Schema Compare tool from the Tools | SQL Server menu when I have a “one-off” database that I need to synch to my database project.

Quite frequently because these are one-off databases there will be a ton of junk items that are in the database, but not in my database project or lots of objects in my project that aren’t in my database.  Either way I want to ignore those changes.  In the past, I’ve always manually unchecked each item, tedious when you have more than two items to uncheck.

Click through for more details.

Extending Nested Sets

Nate Johnson extends the nested sets model to include a depth attribute:

Depth is pretty simple to add if you’ve already got a tree full of data. We can use a recursive common table expression, or “rCTE”. While normally these are frown-worthy (remember, recursion is not SQL’s strong suit), we’re only using it one time to populate an existing data-set, so we can keep on smiling.

Hierarchies in SQL are an important but not well understood topic.
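To give a feel for the one-time depth population described in the quote, here is a minimal sketch using Python's built-in `sqlite3` module rather than SQL Server. The `tree` table, its columns, and the presence of a `parent_id` column alongside the nested-set `lft`/`rgt` bounds are all assumptions for illustration, not the schema from the linked post. SQLite also spells the syntax `WITH RECURSIVE`, whereas T-SQL uses plain `WITH`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical nested-set table that also carries a parent pointer.
conn.executescript("""
CREATE TABLE tree (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER,
    lft       INTEGER NOT NULL,
    rgt       INTEGER NOT NULL,
    depth     INTEGER
);
INSERT INTO tree (id, parent_id, lft, rgt) VALUES
    (1, NULL, 1, 10),
    (2, 1,    2, 7),
    (3, 2,    3, 4),
    (4, 2,    5, 6),
    (5, 1,    8, 9);
""")

# Walk down from the root once, carrying the depth along, then store it.
conn.execute("""
WITH RECURSIVE walk(id, depth) AS (
    SELECT id, 0 FROM tree WHERE parent_id IS NULL
    UNION ALL
    SELECT t.id, w.depth + 1
    FROM tree AS t
    JOIN walk AS w ON t.parent_id = w.id
)
UPDATE tree
SET depth = (SELECT walk.depth FROM walk WHERE walk.id = tree.id)
""")

depths = dict(conn.execute("SELECT id, depth FROM tree ORDER BY id").fetchall())
print(depths)
```

Once the column is populated, later inserts can set depth directly from the parent's depth, so the recursion does not need to run again.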

The Cost Of Compression

Erin Stellato looks at the COMPRESS function:

While this data is interesting, I’m more curious about how compression affects my everyday SELECT queries. I have a set of three stored procedures that each have one SELECT query, so that each index is used. I created these procedures for each table, and then wrote a script to pull values for first and last names to use for testing. Here is the script to create the procedures.

Once we have the stored procedures created, we can run the script below to call them. Kick this off and then wait a couple minutes…

To me, the COMPRESS function is most useful for compressing information you tend not to search through but need to keep in the database, like HTML markup or long descriptions.
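COMPRESS uses the GZIP algorithm, so the trade-off can be roughly illustrated outside the database with Python's `gzip` module. This is only a sketch with made-up sample markup, not a reproduction of the tests in the linked post, and the sizes it prints say nothing about SQL Server's actual storage or CPU cost.

```python
import gzip

# Made-up, repetitive HTML standing in for a long description column.
html = ("<div class='product'><p>Long description text</p></div>\n" * 200).encode("utf-8")

compressed = gzip.compress(html)
restored = gzip.decompress(compressed)

print(len(html), len(compressed))

# Searching means paying for decompression first.
found = b"Long description" in gzip.decompress(compressed)
print(found)
```

The last step is the reason compressed columns suit data you rarely search: every lookup inside the text has to decompress the value before it can inspect it.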

Query Plan Attributes

Grant Fritchey explains the sys.dm_exec_plan_attributes DMV:

There is a DMV that isn’t used a lot of the time because the information within it frequently doesn’t have a lot of bearing on solving fundamental query tuning issues such as out of date statistics, bad or missing indexes, or poorly structured T-SQL. This DMV, sys.dm_exec_plan_attributes, contains a bunch of values that are used by the optimizer to identify a plan in cache, such as object_id (if any), database_id, and compatibility level (compat_level). In addition to these clear & easy to understand attributes, there’s one more, set_options, that’s not immediately clear.

Read on for more information and a sample call.
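The set_options value is a bitmask, which is why it is not readable at a glance. As a rough sketch, the Python below decodes such a value into option names. The flag table is a partial list written from memory of the Microsoft documentation and should be checked against it before relying on it; the function name is hypothetical.

```python
# Partial bit-to-name mapping for set_options (verify against the official docs).
SET_OPTION_FLAGS = {
    1: "ANSI_PADDING",
    2: "Parallel Plan",
    4: "FORCEPLAN",
    8: "CONCAT_NULL_YIELDS_NULL",
    16: "ANSI_WARNINGS",
    32: "ANSI_NULLS",
    64: "QUOTED_IDENTIFIER",
    128: "ANSI_NULL_DFLT_ON",
    256: "ANSI_NULL_DFLT_OFF",
    4096: "ARITH_ABORT",
}


def decode_set_options(value):
    # Keep every option whose bit is set in the stored integer.
    return [name for bit, name in SET_OPTION_FLAGS.items() if value & bit]


for sample in (4347, 251):
    print(sample, decode_set_options(sample))
```

The two sample values differ only in the ARITH_ABORT bit (4347 - 251 = 4096), which is the kind of difference that can leave two cached plans for what looks like the same query.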
