Month: September 2016

HBase Transactions

George Leopold describes Omid:

The transaction manager utilizes a lock-free approach to support multiple clients and relies on a centralized conflict detection component to resolve write-set collisions among concurrent transactions. Developers added that Omid requires no modifications to the underlying HBase key-value data store.

It also features a simplified API that mimics transaction manager APIs in relational databases. Client and server configuration processes also were simplified to help both application developers and system administrators.

Filing this one under the “What’s old is new again” category.
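To make "centralized conflict detection" a bit more concrete, here is a minimal sketch of one common way to resolve write-set collisions: first committer wins, with timestamps handed out by a single component. This is a toy model under my own assumptions, not Omid's actual API or algorithm; `ConflictDetector`, `begin`, and `try_commit` are hypothetical names.

```python
# Toy model of centralized write-write conflict detection.
# All names here are hypothetical; this is not Omid's API.
import itertools


class ConflictDetector:
    def __init__(self):
        self._clock = itertools.count(1)   # single source of timestamps
        self._last_commit = {}             # row key -> commit timestamp of last writer

    def begin(self):
        """Return a start timestamp for a new transaction."""
        return next(self._clock)

    def try_commit(self, start_ts, write_set):
        """Commit if no key in write_set was committed after start_ts.

        Returns the commit timestamp, or None if the transaction must abort.
        """
        for key in write_set:
            if self._last_commit.get(key, 0) > start_ts:
                return None                # a concurrent transaction already won this key
        commit_ts = next(self._clock)
        for key in write_set:
            self._last_commit[key] = commit_ts
        return commit_ts


detector = ConflictDetector()
t1 = detector.begin()
t2 = detector.begin()                                  # t2 runs concurrently with t1
first = detector.try_commit(t1, {"row-a", "row-b"})    # commits
second = detector.try_commit(t2, {"row-b", "row-c"})   # aborts: row-b changed after t2 began
retry = detector.try_commit(detector.begin(), {"row-b", "row-c"})  # commits on retry
```

Clients never hold locks on rows in this model; a losing transaction simply finds out at commit time and retries.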

Comments closed

Query Store Isn’t A Forensics Engine

Grant Fritchey shows that Query Store has only a limited ability to find “ill-behaving” queries at a specific point in time:

Here’s a great question I received: We had a problem at 9:02 AM this morning, but we’re not sure what happened. Can Query Store tell us?

My first blush response is, no. Not really. Query Store keeps aggregate performance metrics about the queries on the database where Query Store is enabled. Aggregation means that we can’t tell you what happened with an individual call at 9:02 AM…

Well, not entirely true.

Query Store isn’t a total solution for “Why was the system slow at XX:XX?” types of questions.  This does not diminish its value as long as you do not try to treat it as your only monitoring solution.
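A small sketch of why aggregation limits point-in-time forensics: once individual executions are rolled up into per-interval statistics, the 9:02 call is no longer distinguishable from its neighbors. The 60-minute interval and the field names below are assumptions for illustration, not Query Store's actual schema.

```python
# Illustration only: the interval length and field names are assumptions,
# not Query Store's actual schema.
from collections import defaultdict
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=60)  # assumed aggregation interval


def interval_start(ts):
    """Floor a timestamp to the start of its aggregation interval."""
    epoch = datetime(1970, 1, 1)
    return epoch + ((ts - epoch) // INTERVAL) * INTERVAL


def aggregate(executions):
    """Collapse (timestamp, duration_ms) pairs into per-interval statistics."""
    buckets = defaultdict(list)
    for ts, duration_ms in executions:
        buckets[interval_start(ts)].append(duration_ms)
    return {
        start: {
            "count": len(durations),
            "avg_ms": sum(durations) / len(durations),
            "max_ms": max(durations),
        }
        for start, durations in buckets.items()
    }


calls = [
    (datetime(2016, 9, 1, 9, 1), 40),
    (datetime(2016, 9, 1, 9, 2), 9000),   # the slow call someone complained about
    (datetime(2016, 9, 1, 9, 30), 50),
]
stats = aggregate(calls)
bucket = stats[datetime(2016, 9, 1, 9, 0)]
```

The bucket's maximum tells you that something slow ran between 9:00 and 10:00, which is the "not entirely true" part, but the exact time and the parameters of that one call are gone.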

Slicer Filter Workaround

Reza Rad has a workaround for cases in which you want to filter a Power BI slicer:

The idea for this blog post came from a question that one of the students in my Power BI course asked me, and I’ve found that it is in high demand on the internet as well. So I’ve decided to write about it.

You might have too many items to show in a slicer. A slicer for customer name when you have 10,000 customers isn’t meaningful! You might be interested only in the top 20 customers, or you might want to pick a few items to show in the slicer. With all other visual types (such as bar chart, column chart, line chart…) you can simply define a visual-level filter on the chart itself. Unfortunately, this feature isn’t supported for slicers at the time of writing this post. However, the demand for this feature is already high! You can see the idea published here in Power BI user voice, so feel free to vote for it :)

As Reza notes, this might get resolved fairly soon.  Until then, check out his solution.

Don’t Use Double Dot

Chris Bell warns against using double dot syntax:

I am finding more and more cases where SQL code is being created using the double dot or period for the 2 part naming convention.

For example, instead of using dbo.table1 I am seeing ..table1.

I don’t know who suggested this in the first place, but it is not a good idea. Sure it works and does what you expect, but there is a HUGE risk with doing this. When you use the .. syntax, you are telling the code to use whatever the default schema is for the user that is running the query. By default that is the dbo schema, but there is no guarantee that all systems are going to be that way.

Read on to understand why this is a big deal.
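As a rough illustration of the risk, here is a simplified model of what happens when the schema is left out of a reference. The catalog, users, and `resolve` function are all hypothetical; the only behavior modeled is the one Chris describes, where an omitted schema means "use the caller's default schema."

```python
# Simplified model of schema resolution; the catalog and user names are hypothetical.
CATALOG = {
    ("dbo", "table1"): "dbo.table1 (shared table)",
    ("sales", "table1"): "sales.table1 (team-specific table)",
}

DEFAULT_SCHEMA = {"alice": "dbo", "bob": "sales"}


def resolve(table, user, schema=None):
    """Return the object a reference resolves to for this user.

    With schema=None (as in ..table1), the user's default schema decides.
    """
    effective = schema if schema is not None else DEFAULT_SCHEMA[user]
    return CATALOG[(effective, table)]


implicit_alice = resolve("table1", "alice")           # what ..table1 means for alice
implicit_bob = resolve("table1", "bob")               # what ..table1 means for bob
explicit_bob = resolve("table1", "bob", schema="dbo")  # what dbo.table1 means for bob
```

The same text hits two different tables depending on who runs it, while the explicit `dbo.table1` form resolves identically for everyone.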

Transaction Names Are Case Sensitive

Clive Strong notes that transaction names in SQL Server are case sensitive:

I had an issue today running a colleague’s code (the rollback and commit were commented out, but that is another story). The code failed and I tried to roll back the transaction but received this error message:

Msg 6401, Level 16, State 1, Line 5
Cannot roll back t1. No transaction or savepoint of that name was found.

I can’t remember the last time I named a transaction, but if you are in that habit, it’s important to remember.

SSRS Express And Azure Limitations

William Assaf points out that SQL Server Reporting Services Express Edition cannot connect to Azure SQL Database:

Express editions of SQL Server Reporting Services, from SQL 2016 on down, cannot connect to Azure SQL Databases. Turns out, getting something for free does have some significant limitations.

For example, you’ll see an error message “The Report Server has encountered a configuration error” on a data source page, when creating a new SSRS data source in the Report Manager website. What you may not have noticed on this page was the possible values in the Data Source Type drop down list.

This is an important limitation if you were thinking of living on the free tier of SSRS.

Against Visual Programming Languages

Ian Hellstrom has a critique of visual programming languages for data engineers:

Anyone with a software development background who has ever dealt with visual ETL tools may have marvelled at the lack of proper version control and diff tools that go with it. Some tools come with their own built-in VCS, while others allow you to use any or no VCS at all. The difficulty lies in the fact that the visual representation is often stored as an XML (or JSON) file. So, if a box is moved by 1 pixel, the file is different. You could argue that it’s indeed different because the layout is different, but you could equally make the case that the logic has not changed. This argument is moot though: it is technically possible to ensure that the tool auto-aligns blocks and routes/colours arrows, very much like yEd does (via menu items). Some users may not be happy with the reduced control over the way the flow looks, but others may rejoice that version control has become usable.

ETL (and ORM) tools often auto-generate code that is not particularly tuned for the data source in question. I have encountered many odd nested loops where simple hash joins would have been more appropriate if only the predicates had been pushed down properly (and if only the tool had evaluated blocks lazily). Aggregations and timestamp-based filters are also often a cause for performance issues. Again, performance is technically solvable, so this may be a valid argument against visual tools in data engineering now but perhaps not tomorrow.

This is a good argument against VPLs, although there are a couple of good arguments for VPLs, including how it’s easier to see if the overall architecture of a flow looks correct.  In the end, I like the compromise that Biml offers Integration Services developers:  write code but visualize results.
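On the "a box is moved by 1 pixel, the file is different" problem, one workaround is to strip layout-only attributes before comparing two versions. The sketch below assumes a made-up XML format; the element name `step` and the attributes `x`, `y`, `width`, and `height` are hypothetical and not taken from any particular ETL tool.

```python
# Sketch only: the element and attribute names are hypothetical,
# not the file format of any particular ETL tool.
import xml.etree.ElementTree as ET

LAYOUT_ATTRS = {"x", "y", "width", "height"}


def logical_form(xml_text):
    """Serialize the flow with layout-only attributes removed."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        for attr in LAYOUT_ATTRS & set(element.attrib):
            del element.attrib[attr]
    return ET.tostring(root, encoding="unicode")


before = '<flow><step id="load" sql="SELECT 1" x="100" y="40"/></flow>'
moved = '<flow><step id="load" sql="SELECT 1" x="101" y="40"/></flow>'
edited = '<flow><step id="load" sql="SELECT 2" x="100" y="40"/></flow>'

layout_only_change = logical_form(before) == logical_form(moved)
logic_change = logical_form(before) != logical_form(edited)
```

Comparing the normalized forms treats the nudged box as unchanged while still flagging the edited SQL, which is roughly what you want from a diff of the logic.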

Comparing Impala To Redshift

Mostafa Mokhtar, et al., have a comparison of Apache Impala to Amazon Redshift:

For this analysis, we used TPC-DS on a 3TB dataset and selected 70 out of the 99 queries that run without any modifications or use variants on both Redshift and Impala. We wanted to use a larger dataset (similar to what we’ve used in previous benchmarks), but due to Redshift’s data load times, we had to reduce the data size. (Note: This benchmark is derived from the TPC-DS benchmark and, as such, is not directly comparable to published TPC-DS results.)

This is coming from one of the two vendors, so take it with however many grains of salt you’d like.

SSISDB Management

Andy Leonard describes steps you can take to maintain the SSIS Catalog:

Back It Up

As with all SQL Server databases, please back up SSISDB. What follows is a (very) basic guide describing one simple method to back up your SSISDB database. Please, please, please learn more about SQL Server backup and restore options and their implications before backing up an SSISDB database in your enterprise. Feel free to use the steps I describe on your laptop or a virtual machine. And please remember…

Backups are useless. Restores are priceless. Conduct practice Disaster Recovery exercises in which you restore databases and then test functionality. You’ll be glad you did. Here is a link containing Microsoft’s advice on restoring the SSISDB database in SQL Server 2016.

The advice is pretty similar to what you’d expect for any other database, but there are a couple twists around SSISDB functionality, so do read on.

Running PowerShell In VS Code

Max Trinidad finishes his series on PowerShell in Visual Studio Code. Here’s part 2:

VS Code – Code Runner Extensions

We need to proceed to install the “Code Runner” Extension. Take a look at this extension information, which can be used with many other scripting languages.

And here’s part 3:

VS Code – Terminal session

In Windows, we are configuring the VS Code “Integrated Terminal” to use the PowerShell Console instead of executing the Windows Cmd shell or Linux Bash. Then again, this is a quick change in the user “settings.json” in your script working folder.

It’s still amazing how far Microsoft has come from the Ballmer days.
