Press "Enter" to skip to content

Author: Kevin Feasel

Testing Azure SQL Database Failover

Arun Sirpal shows us how easy it is to fail over Azure SQL Database to another region:

So what happens now if I connect to the read/write endpoint? (I test this via SSMS)

The dreaded IP address / create a new firewall rule message. Why? Well this setup utilized a “server” level firewall rule and the server in the US did NOT have the IP address mapped in, you can see from the below screen shot that there are no firewall rules configured.

Fixing this is easy, you could just add the IP address on the secondary server as another server level rule but you should seriously consider using a database level firewall rule, the setup will get replicated to the secondary server making failover experience smoother.

Read on to see how to set this.

Comments closed

Checking AD Group Membership

Matthew McGiffen shows us a way to test if an account is a member of an Active Directory group:

As part of my job I manage a bunch of SQL instances for Development and Test.

Access is managed though Active Directory groups, so I rarely have to do anything regards managing permissions. Nonetheless I often get requests from people to give them access. This is usually for a new starter or someone who has moved from one team to another.

Of course, the answer is usually that they just need adding to the right AD group. Rather than assume though, I always get them to check before I pass the request on to the AD team. You never know, there could be something else wrong.

You can also use xp_logininfo if you want to go the other way and get all of the members of a particular group.

Comments closed

Dealing With International Addresses

Danielle Grenier shows us how Etsy handles internationalization of addresses:

It starts off easy – you fill in your name, your street, your apartment number, and then you reach the field labelled “Post Town”. What is a “Post Town”? Huh. Next you see “County”. Well, you know what a county is, but since when do you list it in your address? Then there’s “Postal code”. You might recognize it as what the US calls a “zip code”, but it’s still confusing to see.

So now you don’t really know what to do, right? Where do you put your city, or your state? Do you just cram your address into the form however you can and hope that you get your order? Or do you abandon your cart and decide not to buy anything from this site?

This is more application-level code than data (and so glides over the underlying database work), but Danielle does link to some good resources for figuring out what non-US addresses look like.

Comments closed

Packages And Functions In R

Ellen Talbot walks us through some of the basics of using packages and functions in R:

A function does some computation on an object. The use of a function consists of:

  1. A function’s name
  2. Parentheses
  3. 0 or more inputs

Each input is provided to an argument or parameter within a function.

These arguments have names, although you don’t often need to provide the names.

You can find out what arguments a function takes by using the code completion and it’s help snippet, or by searching for the function in the Rstudio Help tab.

When you’re inside the brackets of a function you can get the list of available arguments and auto-complete them.

Ellen also includes some useful R libraries for working with and visualizing data.

Comments closed

Building A Basic Kafka Producer

M. Mallikarjun shows us a simple producer in Kafka:

A Kafka producer is an application that can act as a source of data in a Kafka cluster. A producer can publish messages to one or more Kafka topics.

So, how many ways are there to implement a Kafka producer? Well, there are a lot! But in this article, we shall walk you through two ways.

  1. Kafka Command Line Tools
  2. Kafka Producer Java API

You can write producers in quite a few languages.  Java is the example here, but there are several libraries, including a good one for .NET.

Comments closed

What You Can Learn At SQL Saturday

Nate Johnson shares a few things he picked up at the SQL Saturday in San Diego:

This was an interesting and even slightly entertaining session presented by Max @ SQLHA. One analogy that really stood out to me was this:

SANs have become a bit like the printer industry — You don’t pay a lot for the enclosure, the device itself, i.e. the SAN box & software; but you pay through the nose for ‘refills’, i.e. the drives that your SAN vendor gods deem worthy of their enclosure.

It’s frighteningly accurate. Ask your storage admin what it costs to add a single drive (or pair of drives, if you’re using something with built-in redundancy) to your SAN. Then compare that cost with the same exact drive off the retail market. It’s highway robbery. And we’re letting them get away with it because we can’t evolve fast enough to take advantage of storage virtualization tech (S2DSOFSRDMA) that effectively makes servers with locally attached SSDs a superior architecture. (As long as they’re not using a horribly outdated interface like SAS!)

Nate also includes several more interesting lessons.  SQL Saturdays are great for picking up useful knowledge.

1 Comment

How Join Hints Affect Adaptive Joins

Grant Fritchey looks at the combination of adaptive joins and query hints which specify join type:

I’ve highlighted the interesting bit. “Actual Number of Locally Aggregated Rows” is part of aggregation push down, explained by the amazing Niko Negebauer here and here. Basically, the aggregation is occurring with the data access. So while there is a Hash Match operator for the aggregation, actually, the active part of the aggregation was performed within the columnstore. That’s why the Actual Number of Rows coming out of the columnstore index itself is 0, but the number of rows coming out of the Hash Match Aggregate is 441.

So… why not another aggregate push down when we used the hint? Because the hint says, we MUST use a hash join. At that point the optimizer has no choices on where, when, how it does data processing. It must, first, ensure that a hash join is used, so it does. First thing out of the gate, hash join. Then a hash aggregate. This difference in behavior results in a 24% decrease in performance. The only interesting thing is that the reads remained consistent. This means that it was just the processing of the join that added overhead.

Read the whole thing.

Comments closed

Running Totals In Power BI With M

Imke Feldmann gives us a reason to use M to calculate running totals:

Today I want to share a scenario where a running total calculation in the query editor saved a model that run out of memory when done with DAX:

Problem

The model couldn’t be refreshed and returned out of memory error with a calculated column in the fact table of over 20 Mio rows (from a csv-file). A running total should be calculated for each “JourneyID”, of which there were over 1 Mio in the table itself. This rose memory consumption during refresh by over 300 % – until it finally errored out:

Click through for the solution.

Comments closed

When Disabled SQL Agent Jobs Still Run

David Fowler troubleshoots a nasty issue with the SQL Server Agent:

In a previous blog post: Duplicate Agent jobs – A good reason not to meddle with Msdb I explained a situation where someone was updating msdb tables manually rather than using the supplied system stored procedures such as msdb.dbo.sp_update_job, It would seem that this was not the only occasion where I would find myself in the midst of the meddlers’ medley.

This time around the meddler decided to disable the job using an update statement against the msdb.dbo.sysjobs table setting enabled from 1 to 0, you have no idea how long it took me to work that out!! but this lead me onto discovering more about the SQL server agent and its general behaviour.

Short form:  use the built in procedures to modify SQL Agent jobs rather than going off and updating tables on your own.  Click through for the long form, with plenty of interesting details.

Comments closed

Natural Keys?

Steve Jones wonders if we should give up on natural primary key constraints:

One of the things I think is important in modeling your particular entity is including a primary key (PK). In my DevOps talk I stress this, as I’d rather most attendees come away thinking a PK is important as their first takeaway from the session. There are exceptions, but they are rare, and I would prefer that most tables just have some PK included from the beginning.

A PK ought to be stable as well, and there are plenty of written words about how to pick the PK for your particular problem domain. Often I have received the advice that natural keys are preferred over surrogate keys, and it is worth the effort to try and identify a suitable column (or set of columns) that will guarantee uniqueness. I think that’s good advice, and it’s also advice I tend to ignore.

Read on for Steve’s reasoning.  I tend to use surrogate keys out of habit, though I do prefer to put unique key constraints on natural keys to help me reason through data models.

Comments closed