Press "Enter" to skip to content

Author: Kevin Feasel

Trickle Migration

Richie Lee encountered a use case for trickle migration:

Recently I needed to apply compression data on a particularly large table. One of the main reasons for applying compression was because the database was extremely low on space, in both the data and the log files. To make matters worse, the data and log files were nowhere near big enough to accommodate compressing the entire table in one go. If the able was partitioned then I could have done one partition at a time and all my problems would go away. No such luck.

Best way to eat an elephant, etc. etc.  Read the whole thing; you might be in a similar situation someday.

Comments closed

Use Parentheses Wisely

Jen McCown plays around with the AND and OR operators:

Specifically, how is it evaluated when your where clause says “WHERE This AND That OR Something AND that”, without any clarifying parenthesis?

Let’s play around with this. The simplest test scenario is a SELECT 1. If I get a 1 back, that means my WHERE clause evaluated to true, right? Right.

Parentheses should clarify statements.  If I see an “AND” and an “OR” in a WHERE clause, I want to see parentheses, even if you’ve gotten it right.  It’s too easy to misinterpret precedence.

1 Comment

Think About Installation

Dave Mason implores us to think of the installation:

Every tsql command in your SQL script(s) has the potential to fail. It’s important to catch and handle tsql errors so that they don’t cause the entire installation to fail. This will require a lot of defensive, resilient, fault-tolerant coding on your part. Here’s an example for creating the database. Note the emphasis on permissions, which I touched on in another post.

This is important advice if you send installation scripts to customers (even if you’re using a packager to generate an install EXE).

1 Comment

Capturing Blocking Information

Erin Stellato shows us how to capture details when processes are blocked:

To view the output from extended events you can open the .xel file in Management Studio or query the data using the sys.fn_xe_file_target_read_file function. I typically prefer the UI, but there’s currently no great way to copy the blocking report text and view it in the format you’re used to.  But if you use the function to read and parse the XML from the file, you can…

If you can’t buy a tool which monitors long-term blocking, you can still build it yourself pretty easily.

Comments closed

Asking The Right Question

Buck Woody argues that the hardest thing about data science is asking the right question:

When I started down the path of learning Data Science, I was nervous. I have to work hard at math – it’s a skill I love but one that does not come naturally to me. I was nervous because I thought the most daunting task I would face in Data Science waslearning all the algebra, statistics, and other maths I would need to do the job.

But I was wrong.

Math isn’t the hardest thing in Data Science. Actually, since it’s so mature, and documented, and well-known, it’s quite possibly the easiest thing to conquer in the skillset. No, the hardest thing about Data Science is asking the right question.

I’ll lodge a bit of a disagreement here.  I’m okay with the argument that asking the right question is the toughest part, but the math’s not particularly easy either…  Knowing when to use which distribution, which model, and which parameters requires a definite amount of skill.

Comments closed

Multi-Tab Reports With R And jQuery

Matt Parker shows us how to create multi-tab reports using jQuery UI and R data:

But R is also part of an entire ecosystem of open tools that can be linked together. For example, Markdown, Pandoc, and knitr combine to make R an incredible tool for dynamic reporting and reproducible research. If your chosen output format is HTML, you’ve linked into yet another open ecosystem with countless further extensions.

One of those extensions – and the focus of this post – is jQuery UI. jQuery UI makes a set of JavaScript’s most useful moves available to developers as a robust, easy-to-implement toolkit ideal for adding a bit of interactivity to your knitr reports.

Generating a page from R is one of those good ideas that I probably don’t want to see in a production environment.

Comments closed

Conformity In Self-Service BI

Paul Turley has a nice post on some of the risks of self-service BI:

In some solutions with a manageable scale and a reasonable tolerance for a certain amount of data loss and inconsistency, this approach may be just fine.  There are very good reasons for inconsistencies between sets of data that come from different sources.  When 20 million fact rows are obtained from an online ordering system and .1% don’t have matching customers records that come from the system used to manage the master customer list, it may simply be a question of timing.  Or, maybe the definition of “customer” is slightly different in the two systems.  Chances are that there are also legitimate data quality issues affecting a small percentage of these records.

Whatever the case may be, a data conformity or potential data quality issue in this design scenario falls on the model designer to either fix or ignore.  Again, this may or may not be a concern but it is a decision point and might raise the question of ownership when questions about data quality are raised.

Paul then goes on to show how this gets fixed in a traditional model and where you need to watch out with SSAS Tabular.  Good essay worth reading.

Comments closed

Visualizing R In Power BI (Too)

Dustin Ryan is also looking at R visualization in Power BI:

Not only can we create and download custom visuals from PowerBI.com to extend the capabilities of Power BI, we can use R to create a ridiculous amount of powerful visualizations. If you can get the data into Power BI, you can use R to perform interesting statistical analysis and create some pretty cool, interactive visuals.

Dustin and Jan Mulkens are working on similar posts at the same time, so watch both of them.

Comments closed

Trace Flags Without Sysadmin

Jack Li shows how to enable a trace flag without sysadmin or changing any application code:

The initial thought is to enable the trace flag at session level.  We ran into two challenges.  First, application needs code change (which they couldn’t do) to enable it.  Secondly, dbcc traceon requires sysadmin rights.   Customer’s application used a non-sysadmin user.  These two restrictions made it seem impossible to use the trace flag.

However, we eventually came up with a way of using logon trigger coupled with wrapping the dbcc traceon command inside a stored procedure.   In doing so, we solved all problems.  We were able to isolate the trace flag just to that application without requiring sysadmin login.

This is the very edge of an edge case.  In normal practice, change the code.

Comments closed