
Author: Kevin Feasel

Using The Public Role

Kenneth Fisher explains the public role in SQL Server:

A common misunderstanding is that the CONNECT permission lets you do more than just connect to a database. It doesn’t. Connection only. So how come there are some things that everyone can do once they are connected to a database? Well, it’s the public role. Everyone is a member and that can’t be changed. In fact, you can’t even disable it. Oh, and I should point out that every database has one.

So what does that mean? If you have a table that you want everyone to have read access to you could grant the permission in public.

I never use the public role for anything, so for me it stays benign.  I strongly dislike database security tools that flag the public role itself as a risk, mostly because I once made the mistake of believing one of those tools and had to start granting things like CONNECT to each new login.
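To make that concrete, here is a minimal sketch of what granting through public looks like (the table and user names are invented for the example):

    -- Every database user is a member of public, so this one grant
    -- gives read access to everyone who can connect to the database.
    GRANT SELECT ON dbo.StateProvince TO public;

    -- CONNECT, by contrast, only controls entry into the database;
    -- it is normally granted automatically when a user is created,
    -- but it can be granted explicitly per user if it has been revoked.
    GRANT CONNECT TO [SomeUser];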


Executing Dynamic SQL: EXEC vs sp_executesql

Tara Kizer gives a scenario where using sp_executesql can result in lower CPU utilization than EXEC:

For frequently executed queries, “EXEC sp_executesql @sql” is a good choice, but at the risk of encountering parameter sniffing issues.

For queries that aren’t executed very often, “EXEC (@sql)” can be used, but you could end up with a bloated plan cache with lots of single-use execution plans.

Let’s compare the performance difference between “EXEC (@sql)” and “EXEC sp_executesql @sql” for a frequently executed query.

For application code, I tend to lean on sp_executesql very heavily to create parameterized queries.
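As a rough sketch of the difference Tara is measuring (table and parameter names are made up for illustration):

    DECLARE @sql nvarchar(max);
    DECLARE @CustomerID int = 42;

    -- EXEC (@sql): the literal value is concatenated into the string,
    -- so each distinct value tends to compile its own single-use plan.
    SET @sql = N'SELECT OrderID FROM dbo.Orders WHERE CustomerID = '
             + CAST(@CustomerID AS nvarchar(11)) + N';';
    EXEC (@sql);

    -- EXEC sp_executesql @sql: the query text stays constant and the value
    -- is passed as a parameter, so the plan can be reused (with the usual
    -- parameter sniffing caveats Tara mentions).
    SET @sql = N'SELECT OrderID FROM dbo.Orders WHERE CustomerID = @CustomerID;';
    EXEC sp_executesql @sql, N'@CustomerID int', @CustomerID = @CustomerID;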


Some Waits Just Need Ignoring

Paul Randal explains that not all SQL Server wait types are pernicious:

Wait statistics analysis is one of my favorite things to talk about because it’s so incredibly useful for performance tuning and can dramatically shorten the time it takes to zero in on the root cause of a performance problem. But you have to do it correctly. You can’t just do a SELECT * FROM sys.dm_os_wait_stats. Various people have published scripts online to aggregate and display wait statistics in an actionable way, and my script is one of the most popular (latest version is always in this post).

One question I’m often asked is why does my script have a list of wait types that it specifically filters out? The answer is that those wait types are what I call ‘benign’ – they’re usually not a problem but happen frequently enough from regular SQL Server operations that they would show up as the top waits and so would obscure the waits that you can do something about.

Read on for the rest of the story.
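If you just want a feel for what that filtering looks like, here is a bare-bones sketch (this is not Paul's script, and the exclusion list is heavily abbreviated):

    -- A few wait types commonly treated as benign; Paul's script
    -- filters out a much longer list.
    SELECT wait_type,
           waiting_tasks_count,
           wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN
          (N'BROKER_TASK_STOP', N'CHECKPOINT_QUEUE', N'LAZYWRITER_SLEEP',
           N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', N'XE_TIMER_EVENT')
    ORDER BY wait_time_ms DESC;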


Scoping And PowerShell Script Blocks

Shane O’Neill looks at variable scope within script blocks in PowerShell:

Now what happens if we don’t want to use switch? What happens if we tried to do this with script blocks instead?

I’ve created 3 different script blocks here just for ease of use. If you want to create a single script block to do this then I encourage it! Let me know how you get on.

We’re also passing the parameter into the script block via the brackets of the .InvokeReturnAsIs() method.

Read on and learn along with Shane.


VLFs: How Many Are Too Many?

Randolph West looks at a baseline for the maximum number of Virtual Log Files for a database:

In a transaction log with too many or too few VLFs we might experience performance issues under a normal workload, as well as during the backup and restore process.

So what is the “right” amount? In customer engagements, I follow a guideline proposed by Glenn Berry of SQLskills.com in his Diagnostic Information Queries, to keep the number of VLFs at or below 200. In my opinion, any number higher than that is cause for concern.

On the other hand, Brent Ozar Unlimited has a popular script called sp_Blitz which proposes a maximum VLF count of 1000. To Brent, a number higher than that is cause for concern.

I tend toward the lower number, but if you have a smoothly-functioning environment and some databases have 700 or 900 VLFs, I probably wouldn’t give it a second thought.
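If you want to check your own numbers, something like this works on newer versions of SQL Server (sys.dm_db_log_info appeared around 2016 SP2 / 2017; on older builds, DBCC LOGINFO gives similar information):

    -- Count virtual log files per database.
    SELECT d.name AS database_name,
           COUNT(*) AS vlf_count
    FROM sys.databases AS d
    CROSS APPLY sys.dm_db_log_info(d.database_id) AS li
    GROUP BY d.name
    ORDER BY vlf_count DESC;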


Tracking Deployment Details

Andy Leonard tells a story whose moral is that you need to keep track of what you deploy:

But this had to be done.
Right now.

I thanked Geoff and hung up the phone. I then made another judgment call and exercised yet more of my ETL Architect authority. I assigned the PrUAT ticket to myself, logged into PrUAT, executed the patch, copied the output of the execution to the Notes field of the ticket (as we’d trained all DBAs and Release Management people to do), and then manually verified the patch was, in fact, deployed to PrUAT.

I closed the ticket and called my boss. “Done. And verified,” I said. My boss replied, “Good,” and hung up. He passed the good news up the chain.

A funny thing happened the next morning. And by “funny,” I mean no-fun-at-all. My boss called and asked, “Andy? I thought you said the patch was deployed to PrUAT.” I was a little stunned, grappling with the implications of the accusation. He continued, “The process failed again last night and vendor checks were – again – not cut.” I finally stammered, “Let me check on it and get back to you.”

It’s a good story and really sells the idea that you have to track deployment details, including when you’re doing manual deployments.


Combining Keras With Apache MXNet

Lai Wei, et al., show how to build a neural network in Keras 2 using MXNet as the engine:

Distributed training with Keras 2 and MXNet

This article shows how to install Keras-MXNet and demonstrates how to train a CNN and an RNN. If you tried distributed training with other deep learning engines before, you know that it can be tedious and difficult. Let us show you what it’s like now, with Keras-MXNet.

Installation is only a few steps

  1. Deploy an AWS Deep Learning AMI

  2. Install Keras-MXNet

  3. Configure Keras-MXNet

The Deep Learning AMI is already set up for trial, so it should be easy to follow along.


There Is No Easy Button With Predictive Analytics

Scott Mutchler dispels some myths:

There are a couple of myths that I see more and more these days.  Like many myths, they seem plausible on the surface, but experienced data scientists know that the reality is more nuanced (and sadly requires more work).

Myths:

  • Deep learning (or Cognitive Analytics) is an easy button.  You can throw massive amounts of data at it and the algorithm will deliver a near-optimal model.
  • Big data is always better than small data.  More rows of data always result in a significantly better model than fewer rows of data.

Both of these myths lead some (lately it seems many) people to conclude that data scientists will eventually become superfluous.  With enough data and advanced algorithms, maybe we don’t need these expensive data scientists…

Read on for a dismantling of these myths.  There’s a lot more to it than “collect all of the data and throw it at an algorithm” (and even then, “all” the data rarely really means all, which I think deserves to be a third myth).  H/T R-bloggers


Multi-Layered Security With Docker Containers

Jessie Frazelle points out the advancements in security that Docker has made over the past couple of years:

Container runtimes have security layers defined by Seccomp, Apparmor, kernel namespaces, cgroups, capabilities, and an unprivileged Linux user. All the layers don’t perfectly overlap, but a few do.

Let’s go over some of the ones that do overlap. I could do them all, but I would be here all day. The mount syscall is prevented by the default Apparmor profile, default Seccomp profile, and CAP_SYS_ADMIN. This is a neat example as it is literally three layers. Wow.

Everyone’s favorite thing to complain about in containers or to prove that they know something is creating a fork bomb. Well this is actually easily preventable. With the PID cgroup you can set a max number of processes per container.

Interesting reading from an insider.


The Semantics Of GraphQL

Adrian Colyer reviews a paper on the mathematical properties behind GraphQL:

The authors study the computational complexity of GraphQL looking at three central questions:

  1. The evaluation problem: what can we say about the complexity of GraphQL query evaluation?
  2. The enumeration problem: how efficiently can we enumerate the results of a query in practice?
  3. The response size problem: how large can responses get, and what can we do to avoid obtaining overly large response objects?

In order to do this, they need to find some solid ground to use for reasoning. So the starting point is a formalisation of the semantics of GraphQL.

This is a review of a published academic paper rather than a how-to guide, so it’s math-heavy.  I am enjoying seeing the development of normal forms for graph processing languages—it’s the beginning of a new generation of normalization purists.
