Press "Enter" to skip to content

Day: February 6, 2019

On R Packages And Trust

Colin Gillespie shares some thoughts about the potentially over-trusting nature of R developers:

One of the great things about R is the myriad of packages. Packages are typically installed via

– CRAN
– Bioconductor
– GitHub

But how often do we think about what we are installing? Do we pay attention or just install when something looks neat? Do we think about security or just take it that everything is secure? In this post, we conducted a little nefarious experiment to see if people pay attention to what they install.

Packages are code, and like any other code, they can contain malicious content.


Misinterpretation and Misuse of P-Values and Confidence Intervals

Dave Giles has some good details on common problems of misinterpretation:

There are so many things in statistics (and hence in econometrics) that are easily, and frequently, misinterpreted. Two really obvious examples are p-values and confidence intervals.

I’ve devoted some space in earlier posts to each of these concepts, and their mis-use. For instance, in the case of p-values, see the posts here and here; and for confidence intervals, see here and here.

Click through for more in this vein, including a reference to an interesting-looking paper.
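As a quick refresher on the two claims people tend to garble: a p-value is the probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one you got; it is not the probability that the null hypothesis is true. In symbols, roughly:

p = \Pr\left( |T| \ge |t_{\mathrm{obs}}| \mid H_0 \right) \ne \Pr\left( H_0 \mid \text{data} \right)

Likewise, a 95% confidence interval describes a procedure that covers the true parameter in 95% of repeated samples; any single realized interval either contains the parameter or it doesn't.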


Monitoring SQL Server with Telegraf

I have a post up on monitoring SQL Server instances with Telegraf:

Not too long ago, I had the opportunity to put into place a free solution for monitoring SQL Server instances. I saw Tracy's series on collecting performance metrics with InfluxDB + Telegraf + Grafana, and then I saw her talk on the topic (Collecting Performance Metrics), but until I implemented it myself, I couldn't believe how easy it was. I thought it was going to take two or three days of hard work to get done, but I had everything going within a few hours.

Let’s walk through the process together.

I keep saying this in the post, but it's much easier than I expected. There are still more steps than with a commercial off-the-shelf product, but part of what you're paying for there is convenience, so it had better be easier.
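To give a sense of how light the SQL Server side of the setup is, the Telegraf sqlserver input plugin typically only needs a dedicated login with a couple of server-level permissions. Here's a rough sketch; the login name and password are placeholders, and your environment may well need more than this:

-- Hypothetical login for the Telegraf sqlserver input plugin; rename as you see fit.
USE master;
GO
CREATE LOGIN [telegraf] WITH PASSWORD = N'UseAStrongPasswordHere!';
-- The plugin reads from DMVs, so server-level view permissions typically suffice.
GRANT VIEW SERVER STATE TO [telegraf];
GRANT VIEW ANY DEFINITION TO [telegraf];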


Formatting with RegEx in SQL Server

Shane O’Neill has a problem:

This is a contrived example but I was given a script that got the “Discipline”, “DocumentVersion”, “DocumentNumber”, “SectionNumber”, and “SectionName” out of the above.

And while it works, I hate that formatting. Everything is all squashed and shoved together.

No, thanks. Let’s see if we can make this more presentable.

Shane has a regular expression. Now Shane has two problems.

In all seriousness, regular expressions are extremely powerful in the right scenario. Shane mentions being okay with keeping regex out of the database engine, and I'm usually alright with that, but there are cases where having it in the engine is really helpful, like figuring out whether a particular input is valid. One example from a project of mine is finding legitimate codes (like ISBNs): a regex solves the problem easily, but my source data is abysmal. I can use the SQL# regular expression functions to drop into CLR and figure out whether a value is any good, something I would have a lot more trouble doing in T-SQL alone.
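To make that concrete, here's a rough sketch of the kind of validation I mean, using SQL#'s RegEx_IsMatch function. The table and column are made up, I'm going from memory on the SQL# parameters (so check the documentation), and the pattern only checks the shape of an ISBN-10, not its check digit:

-- Illustrative only: dbo.Books and its ISBN column are invented for this example,
-- and the SQL# parameter usage is from memory, so verify against the SQL# docs.
-- The pattern checks the shape of an ISBN-10 (nine digits plus a digit or X),
-- not the actual check digit.
SELECT b.ISBN
FROM dbo.Books b
WHERE SQL#.RegEx_IsMatch(b.ISBN, N'^\d{9}[\dX]$', 1, NULL) = 0;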


An Overview of dbatools with Jess and Bert

Bert Wagner has a new video available:

dbatools is one of the coolest community projects I’ve seen – it is amazing how many commands are available to help make managing your SQL Server instances a breeze.

This week I had the opportunity to learn how to use dbatools to automate backups, change recovery models, and discover additional dbatools commands from dbatools contributor Jess Pomfret.

Jess Pomfret then goes into more detail on the commands in the video:

The final tip I had for Bert was how to use Find-DbaCommand to help him find the commands he needed to complete his tasks.

A lot of the commands have tags, which is a good way to find anything relating to compression.

That was a nice collaboration.


Monitoring Entity Framework

Grant Fritchey loves Entity Framework:

Yes, Entity Framework will improve your job quality and reduce stress in your life.

With one caveat: it gets used correctly.

That's the hard part, right? There are tons of technologies that make things better, if used correctly. There are all sorts of programs that make your life easier, if used correctly. Yet all of these, used incorrectly, can make your life hell.

One nit that I’ve always had with Entity Framework is that it’s very difficult to tell what part of the code the call was coming from. You really have no idea. So when my friend, Chris Woodruff, asked me on Twitter what would be the best way to monitor TagWith queries in Entity Framework, well, first, I had to go look up what TagWith was, then I got real excited, because, hey, here’s a solution.

That “I love Entity Framework” is the lead-in to a one-act play of mine with people with pitchforks, tar, and feathers. Nevertheless, Grant shows us how we can tag code in C# and capture that data in extended events. I’d read it but I’m too busy sharpening my pitchfork.


Automated Query Capture With Logic Apps

Arun Sirpal shows how we can use Azure Logic Apps to automate periodic capture of running queries in Azure SQL Database:

Have you ever wanted to capture the T-SQL, waits, session IDs, etc. at a specific time for Azure SQL Database? Sure, there are a few ways to do this. Extended Events comes to mind, but I wanted to do something different.

For this blog post I decided to use Brent Ozar's famous sp_BlitzWho command (in expert mode) coupled with Azure Logic Apps. At a high level it is simple: at a specific time, trigger the execution of the sp_BlitzWho stored procedure and keep the results for later use.

Click through to see how to set this up.
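For reference, the database side of this is about as simple as it gets; something along these lines is what the Logic App fires off on its schedule (expert mode being the option Arun mentions):

-- Roughly the call the Logic App triggers on its schedule.
-- How you persist the results is up to you: the Logic App can consume the result set,
-- or you can look into the First Responder Kit's output-to-table parameters if you'd
-- rather log straight to a table.
EXEC dbo.sp_BlitzWho @ExpertMode = 1;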


Creating A Big Data Cluster

Chris Adkin continues a series on big data clusters in SQL Server 2019:

This post will focus on creating a big data cluster so that you can get up and running as fast as possible. As such, the storage type used will be ephemeral; this is perfectly acceptable for "kicking the tyres." For production-grade installations, integration with a production-grade storage platform is required via a storage plugin. Before we create our cluster, with the assumption that we are doing this on an on-premises infrastructure, the following pre-requisites need to be met:

Read the whole thing, but wait until part 4 before putting anything valuable in it.
