Press "Enter" to skip to content

Month: May 2017

Power BI Report Server

The Power BI team knows how to make me happy:

Today Microsoft announced Power BI Premium — a capacity-based licensing model that increases flexibility for how users access, share and distribute content in Power BI. The new offering also introduces the ability to manage Power BI Reports on-premises with the included Power BI Report Server.

Power BI Report Server will be generally available late in the second quarter of 2017.

I like this a lot for internal company dashboards.

Comments closed

Database Role Members

Nate Johnson has a quick demo showing how to add users to database roles:

The “old” method involves calling system stored-procedures, sp_addrolemember and sp_droprolemember, in which you pass the role-name and username.  The “new” method, supported starting with SQL 2012, is to use the command-phrases ALTER ROLE [role] ADD MEMBER [user], and ALTER ROLE [role] DROP MEMBER [user].

Nate shows both methods, so check it out.

Comments closed

SQL As A Limiting Agent

Bert Wagner has advice for application developers:

Basically, if you are running code similar to above, the reason your job is slow is because you are not optimizing where your work is being performed:

  • Every time you write SELECT * you probably are bringing back more data than you actually need — you are hurting your performance.

  • Every time you don’t have a WHERE clause, you are hurting your performance.

  • Every time your process queries the database multiple times (ie. multiple SELECT statements in your job to bring back data), you are hurting your performance.

It’s nothing new for data professionals, but for application developers who avoid the database as much as possible due to a lack of knowledge, this might be a good wake-up call.

Comments closed

Temp Table Caching

Paul White explains how to cache temporary objects:

Table variables and local temporary tables are both capable of being cached. To qualify for caching, a local temporary table or table variable must be created in a module:

  • Stored procedure (including a temporary stored procedure)
  • Trigger
  • Multi-statement table-valued function
  • Scalar user-defined function

The return value of a multi-statement table-valued function is a table variable, which may itself be cached. Table-valued parameters (which are also table variables) can be cached when the parameter is sent from a client application, for example in .NET code using SqlDbType.Structured. When the statement is parameterized, table-valued parameter structures can only be cached on SQL Server 2012 or later.

The first time I heard about this was a SQL Saturday presentation that Eddie Wuerch did.  Paul does a great job talking about the requirements (and noting that table variables are eligible as well), making this well worth the time to read.

Comments closed

Advice For A Budding Data Scientist

Charles Parker riffs off of an Edsger Dijkstra note:

It’s still early days for machine learning. The bounds and guidelines about what is possible or likely are still unknown in a lot of places, and bigger projects that test more of those limitations are more likely to fail. As a fledgling data engineer, especially in the industry, it’s almost certainly the more prudent course to go for the “low-hanging fruit” — easy-to-find optimizations that have real world impact for your organization. This is the way to build trust among skeptical colleagues and also the way to figure out where those boundaries are, both for the field and for yourself.

As a personal example, I was once on a project where we worked with failure data from large machines with many components. The obvious and difficult problem was to use regression analysis to predict the time to failure for a given part. I had some success with this, but nothing that ever made it to production. However, a simple clustering analysis that grouped machines by the frequency of replacement for all parts had some lasting impact; this enabled the organization to “red flag” machines that fell into “high replacement” group where the users may have been misusing the machines and bring these users in for training.

There’s some good advice.  Also read the linked Dijkstra note; even in bullet point form, he was a brilliant guy.

Comments closed

Using Talend To Build Shape Files

Paul Hernandez has a demo where he uses Talend’s product to convert latitude and longitude pairs to a shape file:

Input data

Customers coordinates: a flat file containing x,y coordinates for every customer.

Municipalities in Austria: a shape file with multi-polygons defining the municipalities areas in Austria: source

Goal

The goal was to “look-up” the coordinates in the shape file in order to get the municipality code GKZ which in german stand for “Gemeindekennzahl”.

Check out the demo.

Comments closed

One Surefire Guarantee For Slow Performance

Patrick Keisler troubleshoots an issue where the buffer pool gets flushed each morning:

What is the McShield service? A quick Bing search revealed that it’s one of the services for McAfee VirusScan Enterprise. Could this be the cause? To get a quick look at all the history, I filtered the application log for event IDs: 17890 and 5000. Each time McAfee got an updated virus DAT file, SQL Server soon followed that by paging out the entire buffer pool. I checked the application log on several other SQL Servers for the same event IDs, and sure enough the same events occurred in tandem each morning. I also got confirmation from the security administration team that McAfee is scheduled to check for new DAT files each morning around 8AM. Eureka!

This seems like it could be the cause of our paging, but a little more research is needed. Searching the McAfee knowledge base, lead me to this article about the “Processes on enable” option.

Enabling this option causes memory pages of running processes to get paged to disk. And the example given is “Oracle, SQL, or other critical applications that need to be memory-resident continually, will have their process address space paged to disk when scan Processes On Enable kicks in”. OUCH! So when the McAfee service starts up or it gets a new DAT file, it will page out all processes.

Fortunately, this is a setting you can turn off, and Patrick shows how.

Comments closed

Building A SQL Server 2014 Container

Andrew Pruski shows how to create a Docker container with SQL Server 2014 SP2 installed:

I’m running all of this on my Windows 10 machine but there are a few things you’ll need before we get started: –

Pre-requisites

  • The microsoft/windowsservercore image downloaded from the Docker Hub

  • Windows Server 2016 installation media extracted toC:\Docker\Builds\Windows

  • SQL Server 2014 SP2 Developer Edition installation media extracted to C:\Docker\Builds\SQLServer2014

Or if you prefer, you can just pull his image, but where’s the fun in that?

Comments closed

DatasauRus Lives

Steph Locke shows how to create a package in R:

Then we need to add github repository to our project. I use the git command line for this:

git remote add origin git@github.com:stephlocke/datasauRus.git
git push --set-upstream origin master

With just these things, I have a package that contains the unit test framework, documentation stubs, continuous integration and test coverage, and source control.

That is all you need to do to get things going!

This is great timing for me, as I’m starting to look at packaging internal code.  Also, it’s great timing because it includes dinosaurs.

Comments closed