Month: September 2018

Why Graph Text Should Be Horizontal

Stephanie Evergreen explains why you don’t want diagonal or vertical text in your charts:

In languages based on the Latin alphabet, we read horizontally, from left to right. Reading on a diagonal produces cramped necks. Reading vertical text is just not going to happen. So, as much as possible, the text in our graphs should be horizontal.

Let’s walk through a demo. I was trying to combat my sense of hopelessness about the world by exploring the latest dataviz related to the Sustainable Development Goals (don’t we all?). The vast majority of their viz is pretty awesome, especially given how complicated the data can be. I saw this graph about how few developing countries have representation on international development councils.

Watch as Stephanie starts with a column chart full of diagonal (and cut-off) text and converts it into a visual that is much easier to read.

Range Locks On Multi-Table Indexed Views

Erik Darling looks at the kinds of locks taken when updating an indexed view:

So what causes Range Locks? Just ask Sunil. He knows everything (this assumes the serializable isolation level):

Equality Predicate

  • If the key value exists, then the range lock is only taken if the index is non-unique. In the non-unique index case, the ‘range’ lock is taken on the requested key and on the ‘next’ key; if the ‘next’ key does not exist, then a range lock is taken on the ‘infinity’ value. If the index is unique, then a regular S lock is taken on the key.

  • If the key does not exist, then the ‘range’ lock is taken on the ‘next’ key, for both unique and non-unique indexes. If the ‘next’ key does not exist, then a range lock is taken on the ‘infinity’ value.

Range Predicate (key between the two values)

  • A ‘range’ lock is taken on all the key values in the range when using ‘between’.

  • A ‘range’ lock is also taken on the ‘next’ key that is outside the range. This is true for both unique and non-unique indexes, and it ensures that no row can be inserted between the requested key and the one after that. If the ‘next’ key does not exist, then a range lock is taken on the ‘infinity’ value.

Erik has an interesting example that lets us see a potential concurrency problem with multi-table indexed views.
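
If you want to see key-range locks in action for yourself, one quick approach is to hold a serializable transaction open and inspect sys.dm_tran_locks. The sketch below is a minimal illustration rather than Erik's demo; the table and index names are made up, and the exact lock modes you observe will depend on the plan SQL Server chooses.

-- Hypothetical table with a non-unique nonclustered index
CREATE TABLE dbo.RangeLockDemo
(
    Id INT NOT NULL,
    Payload VARCHAR(50) NOT NULL,
    INDEX IX_RangeLockDemo_Id NONCLUSTERED (Id)
);

INSERT INTO dbo.RangeLockDemo (Id, Payload)
VALUES (1, 'a'), (2, 'b'), (5, 'c');

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;

-- Range predicate: expect range locks on the qualifying keys plus the 'next' key
SELECT Id, Payload
FROM dbo.RangeLockDemo WITH (INDEX (IX_RangeLockDemo_Id))
WHERE Id BETWEEN 2 AND 4;

-- Key-range locks held by this session (look for RangeS-S in request_mode)
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
  AND resource_type = 'KEY';

COMMIT TRANSACTION;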

Encrypting SQL Server Connections

Jamie Wick has a great post showing how you can encrypt connections to SQL Server:

So, a question that should be asked is: How secure are your client connections? Here are a couple of common misconceptions about SQL Server client connections.

Misconception: Usernames & passwords (SQL or Windows) are used to connect to SQL Server databases, which means the client-server connection is secure.

Explanation
Usernames & passwords are used to control who has what level of permission (read/write/modify) to the data & database. By default, the information being transmitted is not encrypted. As John Martin shows in this article, it is relatively easy for someone with access to a network (wireless access point or LAN connection) to read the unencrypted data that is being sent between a SQL Server and client.

Definitely read the whole thing.  We’re at a point where the overhead cost of encrypting connections is low enough that there’s not much reason to leave production servers transmitting openly over the wire.
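
As a quick sanity check against your own servers (a general-purpose query, not something from Jamie's post), sys.dm_exec_connections will tell you which current connections are encrypted and how they authenticated:

-- encrypt_option shows TRUE when the connection is encrypted
SELECT c.session_id,
       c.encrypt_option,
       c.auth_scheme,        -- SQL, NTLM, or KERBEROS
       s.login_name,
       s.host_name,
       s.program_name
FROM sys.dm_exec_connections AS c
INNER JOIN sys.dm_exec_sessions AS s
    ON s.session_id = c.session_id
WHERE s.is_user_process = 1;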

SQL On Linux: Common Active Directory Login Issues

Dylan Gray and Tejas Shah continue their troubleshooting series for SQL Server on Linux integrations with Active Directory:

1. When a user performs an AD connection, internally the user connects to a service principal name (SPN). The SPNs are in the form “MSSQLSvc/host.contoso.com:1433”, and they must be registered when setting up AD logins for SQL Server on Linux. When a client app requests a connection (e.g. sqlcmd), it takes the server name the user wishes to connect to, prepends “MSSQLSvc/” and appends “:<port>”, and this is the SPN which the connection attempts to authenticate with.

So, if a user connects with “sqlcmd -E -S host.contoso.com”, it authenticates with the SPN “MSSQLSvc/host.contoso.com:1433” and everything succeeds. If a user connects with “sqlcmd -E -S host”, it authenticates with “MSSQLSvc/host:1433”. If the SPN the client is authenticating with does not exist, the connection will fail. So, if the only SPN in mssql.keytab is “MSSQLSvc/host.contoso.com:1433”, users can only connect to “host.contoso.com”, not to “host” and not to the IP address. If users need to be able to connect with variations of the host name and IP address, then all of the appropriate SPNs should be created and configured in the mssql.keytab file.

Read on for more common issues and their solutions.
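
Once the SPNs and mssql.keytab are in place, a simple way to confirm that a connection really is using Kerberos (rather than failing or quietly falling back) is to check the authentication scheme for your own session. This is a general-purpose check, not one lifted from the linked post:

-- Expect KERBEROS for a working AD connection to SQL Server on Linux
SELECT c.session_id,
       c.auth_scheme
FROM sys.dm_exec_connections AS c
WHERE c.session_id = @@SPID;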

Using AT TIME ZONE In SQL Server

Randolph West looks at the AT TIME ZONE clause when working with a specific time zone in SQL Server:

The time zone name is taken from a list maintained in the following Windows registry hive: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones

Note: For SQL Server on Linux (and Docker containers), a registry shim performs the same function as the Windows registry by intercepting the API calls and returning the expected value(s).

We can also use a Transact-SQL (T-SQL) query against the system view sys.time_zone_info, which uses the information from the registry hive. This is the recommended method if you do not have access to the registry hive.

Click through for a couple of examples.
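
As a small illustration of the two approaches (the time zone names below are standard Windows identifiers; substitute whichever zone you care about):

-- List the time zones SQL Server knows about, with their current offsets
SELECT name, current_utc_offset, is_currently_dst
FROM sys.time_zone_info;

-- Treat a datetime2 value as UTC, then convert it to a named time zone
SELECT SYSUTCDATETIME()
           AT TIME ZONE 'UTC'
           AT TIME ZONE 'Eastern Standard Time' AS EasternTime;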

Using NUnit For SQL Server Integration Tests

Ben Jarvis shows us how to use NUnit to perform integration testing with SQL Server stored procedures:

I wanted a way to automate the integration testing of my repositories and stored procedures so I developed the solution described below using NUnit as the test framework and SQL Server LocalDB as the database to run my tests against.

I had the following requirements for my solution which NUnit has been able to satisfy:

  • Quick – tests should run quickly and not require massive amounts of set up / tear down

  • Independent – all tests should be independent from one another and responsible for their own set up / tear down

  • Simple – the test code should be simple to understand and easy to work with when writing new tests.

  • Work Everywhere – the tests should be able to work anywhere and not require huge dependencies like a full SQL Server instance; they should be able to work with SQL LocalDB

Read on for the solution.

Dashboard Conversations In Power BI

Teo Lachev points out something pretty new to Power BI:

I’ve noticed that the dashboard conversations are now available. Just open a Power BI dashboard and click the Comments menu. This will open a Comments pane where you can post comments related to the entire dashboard. You can also post comments for a specific tile by clicking the tile ellipsis menu and then choosing “Add a comment”. You know that a tile has comments when you see the “Show tile conversations” button that floats on the tile. Clicking this button brings up the Comments pane so you can see and participate in the discussion.

Click through for an example of what this looks like.

R: Passing A Formula To lm

John Mount has a new R tip, this time on passing a formula to lm() and seeing that formula in the model later:

This works, and the paste() pattern is so useful we suggest researching and memorizing it.

However, the “call” portion of the model is reported as “formula = f” (the name of the variable carrying the formula) instead of something more detailed. Frankly this printing issue never bothered us. None of our tools or workflows currently use the model call item, and for a very large number of variables, formatting the call contents in the model report becomes unwieldy. We also already have the formula in a variable, so if we need it we can save it or pass it along.

There is a much better place on many models to get model structure information from than the model call item: the model terms item. This item carries a lot of information and formats up quite nicely:

format(terms(model))
# [1] "mpg ~ cyl + disp + hp + carb"

Be sure to check out the comments too, as there are several solutions to this problem.

Lists In R

Dave Mason continues his process of learning about R data structures with a survey of lists:

In previous lessons, we’ve noted vectors and matrices consist of data elements of the same class. R will coerce data elements to a single class if we attempt to create a vector or matrix with data elements of differing classes. Lists, on the other hand, can hold data elements of different classes, such as the integer, character, or logical class. In fact, a list can hold most anything in R, including vectors, matrices, and many more! None to my surprise, lists can be created with the list() function:

And if you want to work with lists, purrr is a great package to learn.

Using JSON In Azure Data Lake Analytics

Jeffrey Verheul shows how to register .NET assemblies in Azure Data Lake Analytics:

The power of Azure Data Lake is that you can use a variety of different file types to process data (from Azure Data Lake Analytics). But in order to use JSON, you need to register some assemblies first.

Downloading assemblies
The assemblies are available on GitHub for download. Unfortunately, you need to download the solution and compile it on your machine, so I’ve also made the two DLLs you need available via direct download:

Click through for links to the assemblies and instructions on how to register them.  And this continues my long-running joke that every .NET project has Newtonsoft.Json as a core requirement.
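
For context, registration boils down to a couple of U-SQL statements once the DLLs are in your Data Lake Store. This is a rough sketch under assumptions: the store paths are hypothetical, and the JSON support comes from the Microsoft.Analytics.Samples.Formats sample assembly, which depends on Newtonsoft.Json.

-- Register the DLLs in an ADLA catalog database (paths are made up)
USE DATABASE master;

CREATE ASSEMBLY IF NOT EXISTS [Newtonsoft.Json]
    FROM @"/Assemblies/Newtonsoft.Json.dll";

CREATE ASSEMBLY IF NOT EXISTS [Microsoft.Analytics.Samples.Formats]
    FROM @"/Assemblies/Microsoft.Analytics.Samples.Formats.dll";

-- A U-SQL script that extracts JSON then references them up front
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];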
