Month: December 2016

Backing Up Analysis Services

Jens Vestergaard shows how to take backups of Analysis Services cubes:

I have not met a setup where applying compression was not an option, yet. Obviously this has a penalty cost on CPU while executing the backup, and will affect the rest of the tasks running on the server (even if you have your data and backup dir on different drives). But in my experience, the impact is negligible.

This may not be the case with the encryption option, as this has a much larger footprint on the server. You should be using this with some caution in production. Test on smaller subsets of the data if in doubt.
Another thing to keep in mind, as always when dealing with encryption, do remember the password. There is no way of retrieving the data other than with the proper password.

My goal is to be able to rebuild any cube from the relational database, but even with that goal in mind, it is smart to have backups.

When Was That Index Modified?

Kendra Little looks at index creation and modification dates:

SQL Server doesn’t really track index create or modification date by default

I say “really”, because SQL Server’s default trace captures things like index create and alter commands. However, the default trace rolls over pretty quickly on most active servers, and it’s rare that you’re looking up the creation date for an index you created five minutes ago.

I think it’s fine that SQL Server doesn’t permanently store the creation date and modification date for most indexes, because not everyone wants this information — so why not make the default as lightweight as possible?

That said, Kendra has several methods for answering the question of when a particular index was created.

SQL As A Declarative Language

Lukas Eder discusses one benefit to a declarative language like SQL:

It’s simple. Both the set-builder notation, and the SQL language (and in principle, other languages’ for comprehensions) are declarative. They are expressions, which can be composed to other, more complex expressions, without necessarily executing them.

Remember the imperative approach? We tell the machine exactly what to do:

  • Start counting from this particular minimal integer value
  • Stop counting at this particular maximal integer value
  • Store all even integers in between in this particular intermediate collection

What if we don’t actually need negative integers? What if we just wanted to have a utility that calculates even integers and then reuse that to list all positive integers? Or, all positive integers less than 100? Etc.

It may be my innate contrarian curmudgeonliness, but I am moving more and more toward the idea that the easiest way to deal with data is a combination of SQL and functional programming languages, leaving OO out of the picture.
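
As a rough illustration of the composability point in the quote, here is a minimal Python sketch using lazy generators. The helper names (`evens`, `positive`, `below`) are my own inventions for this example, and starting the count at zero is an assumption; none of this comes from Lukas's post.

```python
from itertools import count, islice, takewhile


def evens():
    """Describe every even integer from 0 upward; nothing is computed until consumed."""
    return (n for n in count(0) if n % 2 == 0)


def positive(numbers):
    """Keep only the values greater than zero."""
    return (n for n in numbers if n > 0)


def below(limit, numbers):
    """Cut an ascending stream off at the first value that reaches the limit."""
    return takewhile(lambda n: n < limit, numbers)


# The same building blocks answer different questions without being rewritten.
first_five = list(islice(positive(evens()), 5))
small = list(below(100, positive(evens())))
```

Each helper is an expression that can be combined with the others, which is roughly what the imperative "count from min to max into a collection" version cannot offer without being rewritten.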

SQL Agent Alerts

David Alcock has a script to create SQL Agent alerts for common errors:

These alerts cover a range of errors from potential IO subsystem problems to failed logins, all of which are things a DBA needs to know about, and quickly too.
As well as error notifications you can set up alerts to cover performance conditions. The final statement in the script below sets up an alert that triggers when Page Life Expectancy drops below 1000. In all honesty I don’t set up these performance alerts that often but I wanted to show you the kind of thing that is possible and would be handy if you don’t have any third party monitoring.

He follows this up with a post on appropriate response:

But what do I mean by sensible? Typically I see a number of problems with alerting setups; either alerts are inadequate and don’t cover the necessary errors (or there are none at all) but I also see the notifications to alerts not being set up correctly meaning problems go backwards and forwards delaying any fixes.
The other problem I see is an over provision of alerts. This usually is because one or more other monitoring systems have been deployed and error notifications have been duplicated as a result. Imagine having an operational tool like System Centre, some SQL monitoring software and native alerting all pinging the same message to the one recipient mailbox. Now on top of that let’s say the alerts have not been configured correctly so information emails are being issued every second. It’s a scary thought but it is easy to see how a critical error might be missed in this scenario.

If you don’t have automatic alerts for high-severity errors, this is an easy way of gaining insight into the problems your server is experiencing.

Ring Buffers

Juho Snellman explains ring buffers:

This is of course not a new invention. The earliest instance I could find with a bit of searching was from 2004, with Andrew Morton mentioning it in a code review so casually that it seems to have been a well established trick. But the vast majority of implementations I looked at do not do this.

So here’s the question: Why do people use the version that’s inferior and more complicated? I must have written a dozen ring buffers over the years, and before being forced to really think about it, I’d always just used the first definition. I can understand why a textbook wouldn’t take advantage of unsigned integer wraparound. But it seems like it should be exactly the kind of cleverness that hackers would relish using and passing on.

Check out the comments for more information, a bit of code golf, and multiple links on tying shoelaces.
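
For readers who want to see the wraparound trick in miniature, here is a hedged Python sketch. It assumes the variant under discussion keeps free-running read and write indices that are only masked when touching the array, with a power-of-two capacity. Python integers never overflow, so the 32-bit wraparound is emulated with a mask; the class and method names are hypothetical, not taken from Juho's post.

```python
class RingBuffer:
    """Ring buffer with free-running indices (a sketch, not a reference implementation)."""

    WORD_MASK = 0xFFFFFFFF  # emulates uint32 wraparound, which Python lacks natively

    def __init__(self, capacity):
        # A power-of-two capacity lets "index & (capacity - 1)" stand in for modulo.
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._read = 0
        self._write = 0

    def __len__(self):
        # Wrapped subtraction gives the element count even after the indices overflow.
        return (self._write - self._read) & self.WORD_MASK

    def is_empty(self):
        return self._read == self._write

    def is_full(self):
        return len(self) == self._capacity

    def push(self, item):
        if self.is_full():
            raise IndexError("push to full ring buffer")
        self._slots[self._write & (self._capacity - 1)] = item
        self._write = (self._write + 1) & self.WORD_MASK

    def shift(self):
        if self.is_empty():
            raise IndexError("shift from empty ring buffer")
        item = self._slots[self._read & (self._capacity - 1)]
        self._read = (self._read + 1) & self.WORD_MASK
        return item
```

Because the indices are never masked when stored, empty (`read == write`) and full (`write - read == capacity`) are distinguishable without a separate length field or a wasted slot.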

Analyzing Taxi Data With Microsoft R Server

Ali Zaidi builds a Spark cluster to analyze 1.1 billion taxi cab rides using Microsoft R Server:

In a similar spirit to how sparklyr allowed us to reuse our functions from the dplyr package to manipulate Spark DataFrames, the RxSpark API allows a data scientist to develop code that can be deployed in a multitude of environments. This allows the developer to shift their focus from writing code that’s specific to a certain environment, and instead focus on the complex analysis of their data science problem. We call this flexibility Write Once, Deploy Anywhere, or WODA for the acronym lovers.

For a deeper dive into the RevoScaleR package, I recommend you take a look at the online course, Analyzing Big Data with Microsoft R Server. Much of this blogpost follows along the last section of the course, on deployment to Spark.

R isn’t just for small, one-off jobs anymore.

Transaction Log Operations And Backups

John Deardurff explains what happens in the transaction log when you restore a backup:

In the example, the database performed a checkpoint at noon and a backup had been taken at that time. The restore process will capture all the transactions up until the point the database had been backed up. After the database has been restored, the recovery process will roll forward transactions 2 and 4 because they had been committed to the transaction log before the point of failure. Since transactions 3 and 5 did not commit before the time of system failure, the undo process will roll back the transactions to keep the data in a consistent state.

Read the whole thing.
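
To make the roll-forward and roll-back steps concrete, here is a toy Python model. It is a deliberately simplified sketch and not how SQL Server implements recovery: the log format, the `recover` function, and the sample values are all assumptions made up for illustration, with transaction numbers chosen to mirror the quoted example.

```python
def recover(backup_state, log):
    """Replay a toy transaction log over a restored backup and return the result."""
    state = dict(backup_state)  # work on a copy so the backup itself is untouched
    committed = {txn for txn, op, *_ in log if op == "commit"}

    # Redo phase: reapply every logged write in log order.
    for txn, op, *change in log:
        if op == "write":
            key, _old, new = change
            state[key] = new

    # Undo phase: walk the log backwards and revert writes from
    # transactions that never reached a commit record.
    for txn, op, *change in reversed(log):
        if op == "write" and txn not in committed:
            key, old, _new = change
            state[key] = old
    return state


backup = {"a": 1, "b": 1, "c": 1, "d": 1}
log = [
    (2, "write", "a", 1, 20),
    (3, "write", "b", 1, 30),
    (2, "commit"),
    (4, "write", "c", 1, 40),
    (5, "write", "d", 1, 50),
    (4, "commit"),
    # failure happens here: transactions 3 and 5 never committed
]
recovered = recover(backup, log)
```

In this model the writes from transactions 2 and 4 survive, while those from 3 and 5 are reverted to the values recorded in the log.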

Streaming Data With Kinesis

Asaaf Mentzer shows how to join streaming data (specifically, AWS Kinesis) with lookup data:

In this use case, Amazon Kinesis Analytics can be used to define a reference data input on S3, and use S3 for enriching a streaming data source.

For example, bike share systems around the world can publish data files about available bikes and docks, at each station, in real time.  On bike-share system data feeds that follow the General Bikeshare Feed Specification (GBFS), there is a reference dataset that contains a static list of all stations, their capacities, and locations.

There are three different architectures in here, so if you’re looking for streaming data models with Kinesis (or want to apply them to Kafka), this is a solid read.
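
The core idea of enriching a stream with reference data can be sketched without any AWS services at all. The following Python snippet is only an in-memory analogy: the `enrich` function, the field names, and the sample station data are illustrative assumptions, and it makes no calls to Kinesis or S3.

```python
def enrich(stream, reference, key="station_id"):
    """Left-join each streaming record to static reference data by key."""
    for record in stream:
        # Unknown keys pass through unenriched instead of being dropped.
        lookup = reference.get(record[key], {})
        # Values from the streaming record win if both sides share a field.
        yield {**lookup, **record}


# Static reference data: loaded once, like a station list kept in a file.
stations = {
    "s1": {"name": "Main St", "capacity": 20},
    "s2": {"name": "Harbor", "capacity": 15},
}

# Streaming side: status updates that arrive continuously.
status_updates = [
    {"station_id": "s1", "num_bikes_available": 4},
    {"station_id": "s9", "num_bikes_available": 7},
]

enriched = list(enrich(status_updates, stations))
```

Keeping the reference data in a dictionary keyed by the join column is the in-memory equivalent of the lookup table; a real pipeline would also need to handle refreshing that reference data, which this sketch ignores.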

Dial Gauge

Devin Knight explains the dial gauge custom visual:

  • The effectiveness of gauges on dashboards is an often debated topic.

  • The Dial Gauge is completely data driven, which means that not only must your measure (which drives the needle) come from a dataset, but the different threshold ranges must come from your dataset too.

  • There are no specific Format settings for the Dial Gauge, which does limit you a bit with what you can do with this gauge.

There are certain scenarios in which I think the dial gauge works well. The best scenario is the same as its analog counterpart: when you are measuring a single continuous variable with a safe range and meaningful range differences. This scenario occurs less often than you might think.

Cannot Connect To WMI Provider

Andrew Peterson troubleshoots an error after installing SSMS vNext:

After installing SQL Server Management Studio for vNext, the Configuration Manager no longer opens, with a message similar to the following:

Cannot connect to WMI provider. You do not have permission or the server is unreachable. Note that you can only manage SQL Server 2005 and later servers with SQL Server Configuration Manager.
Invalid namespace [0x8004100e]

Read on for the solution.