Apache Kudu

Kevin Feasel



Greg Rahn discusses Apache Kudu:

At a high level, Kudu is a new storage manager that enables durable single-record inserts, updates, and deletes, as well as fast and efficient columnar scans due to its in-memory row format and on-disk columnar format.  This architecture makes Kudu very attractive for data that arrives as a single record at a time or that may need to be modified at a later time.

Today, many users try to solve this challenge via a Lambda architecture, which presents inherent challenges by requiring different code bases and storage for the necessary batch and real-time components. Using Kudu and Impala together completely avoids this problematic complexity by easily and immediately making data inserted into Kudu available for querying and analytics via Impala. (For more technical details on how Impala and Kudu work together for analytical workloads, see this post.)

I’d jokingly say “Someday, somebody’s going to reinvent the relational database inside of Hadoop.”  But it seems like that’s less of a joke than a medium-term prediction.

TDE With Database Mirroring

I have a post on setting up database mirroring when the underlying database uses Transparent Data Encryption:

 Now it’s time to take some backups. First, let’s back up the various keys and certificates:

USE [master]
--Back up the service master key
--Note that the password here is the FILE password and not the KEY password!
BACKUP SERVICE MASTER KEY TO FILE = 'C:\Temp\ServiceMasterKey.key' ENCRYPTION BY PASSWORD = 'Service Master Key Password';
--Back up the database master key
--Again, the password here is the FILE password and not the KEY password.
BACKUP MASTER KEY TO FILE = 'C:\Temp\DatabaseMasterKey.key' ENCRYPTION BY PASSWORD = 'Database Master Key Password';
--Back up the TDE certificate we created.
--We could create a private key with password here as well.
BACKUP CERTIFICATE [TDECertificate] TO FILE = 'C:\Temp\TDECertificate.cert'
    WITH PRIVATE KEY (FILE = 'C:\Temp\TDECertificatePrivateKey.key', ENCRYPTION BY PASSWORD = 'Some Private Key Password');

Click through for the details.

SSAS And Power BI Performance Issue

Chris Webb describes an issue with SSAS Multidimensional and Power BI-generated DAX causing a performance problem:

This query has something in it – I don’t know what – that means that it cannot make use of the Analysis Services Storage Engine cache. Every time you run it SSAS will go to disk, read the data that it needs and then aggregate it, which means you’ll get cold-cache performance all the time. On a big cube this can be a big problem. This is very similar to problems I’ve seen with MDX queries on Multidimensional and which I blogged about here; it’s the first time I’ve seen this happen with a DAX query though. I suspect a lot of people using Power BI on SSAS Multidimensional will have this problem without realising it.

This problem does not occur for all tables – as far as I can see it only happens with tables that have a large number of rows and two or more hierarchies in. The easy way to check whether you have this problem is to refresh your report, run a Profiler trace that includes the Progress Report Begin/End and Query Subcube Verbose events (and any others you find useful) and then refresh the report again by pressing the Refresh button in Power BI Desktop without changing it at all. In your trace, if you see any of the Progress Report events appear when that second refresh happens, as well as Query Subcube Verbose events with an Event Subclass of Non-cache data, then you know that the Storage Engine cache is not being used.

This doesn’t look to be a quick fix, so do read the whole thing to help figure out how to avoid this issue.

HBase Transactions

Kevin Feasel



George Leopold describes Omid:

The transaction manager utilizes a lock-free approach to support multiple clients and relies on a centralized conflict detection component to resolve write-set collisions among concurrent transactions. Developers added that Omid requires no modifications to the underlying HBase key-value data store.

It also features a simplified API that mimics transaction manager APIs in relational databases. Client and server configuration processes also were simplified to help both application developers and system administrators.

Filing this one under the “What’s old is new again” category.

Query Store Isn’t A Forensics Engine

Grant Fritchey shows that Query Store has a limited capability of finding “ill-behaving” queries at a point in time:

Here’s a great question I received: We had a problem at 9:02 AM this morning, but we’re not sure what happened. Can Query Store tell us?

My first blush response is, no. Not really. Query Store keeps aggregate performance metrics about the queries on the database where Query Store is enabled. Aggregation means that we can’t tell you what happened with an individual call at 9:02 AM…

Well, not entirely true.

Query Store isn’t a total solution for “Why was the system slow at XX:XX?” types of questions.  This does not diminish its value as long as you do not try to treat it as your only monitoring solution.

Slicer Filter Workaround

Reza Rad has a workaround for cases in which you want to filter a Power BI slicer:

The idea of this blog post came from a question that one of students in my Power BI course asked to me, and I’ve found this as a high demand in internet as well. So I’ve decided to write about it.

You might have too many items to show in a slicer. a slicer for customer name when you have 10,000 customers isn’t meaningful! You might be only interested in top 20 customers. Or you might want to pick few items to show in the slicer. With all other visual types (Such as Bar chart, Column chart, line chart….) you can simply define a visual level filter on the chart itself. Unfortunately this feature isn’t supported at the time of writing this post for Slicers. However the demand for this feature is already high! you can see the idea published here in Power BI user voice, so feel free to vote for such feature :)

As Reza notes, this might get resolved fairly soon.  Until then, check out his solution.

Don’t Use Double Dot

Chris Bell warns against using double dot syntax:

I am finding more and more cases where SQL code is being created using the double dot or period for the 2 part naming convention.

For example, instead of using dbo.table1 I am seeing ..table1.

I don’t know who suggested this in the first place, but it is not a good idea. Sure it works and does what you expect, but there is a HUGE risk with doing this. When you use the .. syntax, you are telling the code to use whatever the default schema is for the user that is running the query. By default that is the dbo schema, but there is no guarantee that all systems are going to be that way.

Read on to understand why this is a big deal.

Transaction Names Are Case Sensitive

Clive Strong notes that transaction names in SQL Server are case sensitive:

I had an issue today running a colleague’s code (the rollback and commit were commented out, but that is another story). The code failed and I tried to rollback the transaction but received this error message;

Msg 6401, Level 16, State 1, Line 5
Cannot roll back t1. No transaction or savepoint of that name was found.

I can’t remember the last time I named a transaction, but if you are in that habit, it’s important to remember.

SSRS Express And Azure Limitations

William Assaf points out that SQL Server Reporting Services Express Edition cannot connect to Azure SQL Database:

Express editions of SQL Server Reporting Service, from SQL 2016 on down, cannot connect to Azure SQL Databases. Turns out, getting something for free does have some significant limitations.

For example, you’ll see an error message “The Report Server has encountered a configuration error” on a data source page, when creating a new SSRS data source in the Report Manager website. What you may have not noticed on this page was the possible values in the Data Source Type drop down list.

This is an important limitation if you were thinking of living on the free tier of SSRS.


September 2016
« Aug Oct »