
Month: October 2020

Azure SQL Championship

Mala Mahadevan announces a contest:

Learning can be drudgery, but it can also be fun. One of the fun ways to learn Azure is to take part in the Azure SQL Championship – a joint attempt by Microsoft and PASS to promote Azure learning. From October 12-30, there will be daily quizzes and simple challenges to solve. If you do it right, you have a chance to win some fabulous prizes.

Read on to learn more, including the prizes on offer.

Comments closed

Making Use of Sort Rewinds: Closest Match

Paul White follows up on an article:

In When Do SQL Server Sorts Rewind? I described how most sorts can only rewind when they contain at most one row. The exception is in-memory sorts, which can rewind at most 500 rows and 16KB of data.

These are certainly tight restrictions, but we can still make use of them on occasion.

To illustrate, I am going to reuse a demo Itzik Ben-Gan provided in part one of his Closest Match series, specifically solution 2 (modified value range and indexing).

Click through for the explanation.


Probability Distributions in Real Life

Stephanie Glen gives us examples of where specific probability distributions appear naturally:

If you’re in the beginning stages of your data science credential journey, you’re either about to take (or have taken) a probability class. As part of that class, you’re introduced to several different probability distributions, like the binomial distribution, geometric distribution and uniform distribution. You might be tempted to skip over some elementary topics and just scrape by with a bare pass. Because, let’s face it–the way probability is taught (with dice rolls and cards) is far removed from the glamor of data science. You may be wondering:

When am I ever going to calculate the probability of five die rolls in a row in real life?

Click through for the answer and for a chart that provides different scenarios for real-world probability distributions.
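
As a small illustration of the dice question in the excerpt, here is a minimal Python sketch. It is not taken from Stephanie's article. It assumes a fair six-sided die and reads "five die rolls in a row" as rolling the same face five times running; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def same_face_run(rolls: int, sides: int = 6) -> Fraction:
    """Probability that `rolls` fair die rolls all show the same face."""
    # The first roll can be anything; each later roll must match it.
    return Fraction(1, sides) ** (rolls - 1)


def binomial_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    # Same face five times in a row.
    print(same_face_run(5))  # 1/1296
    # Exactly two sixes in five rolls (a binomial distribution).
    print(binomial_pmf(2, 5, Fraction(1, 6)))  # 625/3888
```

Using `Fraction` keeps the results exact, which makes it easier to check them against a hand calculation.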


When SQL Server Sorts Can Rewind

Paul White turns back the hands of time:

Sorts use storage (memory and perhaps disk if they spill) so they do have a facility capable of storing rows between loop iterations. In particular, the sorted output can, in principle, be replayed (rewound).

Still, the short answer to the title question, “Do Sorts Rewind?” is:

Yes, but you won’t see it very often.

Read the whole thing.


Automating Power BI Report Deployment

Martin Schoombee continues a series on Power BI automation:

Deploying the report is seemingly straight-forward, but there are some risks we need to consider:

– What should we do if the report already exists?
– If the dataset exists, what should we do if there are other reports that use this (shared) dataset?

The last item is a bit of an edge-case that we’ll have to dive deeper into, but let’s look at the basic cmdlet first.

Click through to see how, as well as some thoughts on those risks.


Monitoring Storage Metrics

Robert Sheldon continues a series on storage concepts for the DBA:

When monitoring storage systems, engineers should track a variety of metrics to ensure the systems continue to meet application requirements. Three of the most important and commonly cited metrics are latency, I/O operations per second (IOPS), and throughput. In addition to these three, queue length and I/O splitting can also provide valuable insights into storage performance.

In this article, I discuss all five of these metrics and demonstrate them in action. Despite my focus on these five, they’re not the only important metrics to monitor. For example, engineers should also track storage capacity, device cache usage, controller operations, and storage networks. Even seemingly unrelated components can be a factor, such as low CPU utilization, which can indicate that the processor is waiting on storage to complete requests from the application.

Robert also shows off the new perfmon, which may or may not be better than Perfmon Classic.
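
To make the relationship between the three headline metrics concrete, here is a rough Python sketch. It is not from Robert's article. It assumes a steady workload with a single fixed I/O size, and it uses Little's law as an approximation for the number of outstanding requests; the function names and the sample numbers are made up for illustration.

```python
def throughput_bytes_per_sec(iops: float, io_size_bytes: int) -> float:
    """Throughput is the number of I/Os per second times the size of each I/O."""
    return iops * io_size_bytes


def avg_outstanding_ios(iops: float, latency_seconds: float) -> float:
    """Little's law: average requests in flight = arrival rate x time in system."""
    return iops * latency_seconds


if __name__ == "__main__":
    iops = 5000              # hypothetical workload
    io_size = 8 * 1024       # 8 KiB per I/O
    latency = 0.002          # 2 ms average latency

    mib_per_sec = throughput_bytes_per_sec(iops, io_size) / (1024 * 1024)
    print(f"{mib_per_sec:.2f} MiB/s")                        # 39.06 MiB/s
    print(f"{avg_outstanding_ios(iops, latency):.1f} I/Os")  # about 10 in flight
```

The same IOPS figure can mean very different throughput depending on I/O size, which is one reason the metrics are usually read together.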


Stopping and Starting an Azure Kubernetes Service Cluster

Mohammad Darab wants to save some cash (or at least Azure credits):

I remember when I first started deploying Big Data Clusters, they were on Azure Kubernetes Service utilizing the $200 credit for first time sign ups. By the time I got around to figuring out how to deploy the BDC, not only was my $200 credit gone, but I started to incur cost out of pocket.

If only there was a feature that would allow me to stop the VMs in AKS whenever I wasn’t using them. Well, I’m excited to share that Microsoft AKS (Azure Kubernetes Service) came out with a neat feature (currently in preview at the time of the publishing of this post) that allows you to stop and start your AKS cluster by running a simple command. Of course I had to try it out on BDCs and to my surprise it worked. Well, sort of. Let me explain…

Read on for more information, as well as current limitations.
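
For a sense of what "a simple command" looks like, here is a hedged Python sketch that wraps the Azure CLI's `az aks stop` and `az aks start` commands. It assumes the Azure CLI is installed and logged in; because the feature was in preview when the post was written, a given CLI version may need extra setup. The resource group and cluster names are hypothetical, and the helper functions are not from Mohammad's post.

```python
import subprocess
from typing import List


def aks_power_command(action: str, resource_group: str, cluster_name: str) -> List[str]:
    """Build the Azure CLI argument list for stopping or starting an AKS cluster."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return [
        "az", "aks", action,
        "--resource-group", resource_group,
        "--name", cluster_name,
    ]


def run_aks_power_command(action: str, resource_group: str, cluster_name: str) -> None:
    """Run the command; raises CalledProcessError if the CLI reports a failure."""
    subprocess.run(aks_power_command(action, resource_group, cluster_name), check=True)


if __name__ == "__main__":
    # Hypothetical names; this only prints the command rather than running it.
    print(" ".join(aks_power_command("stop", "bdc-rg", "bdc-aks")))
```

Building the argument list separately from running it makes it easy to inspect the command before anything touches a real cluster.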


Mixed MultiSubnetFailover Support on AGs

Andy Mallon continues a line of thought:

In yesterday’s post, I showed how to configure an availability group (AG) to use RegisterAllProvidersIP=0 when you can’t get clients to connect using the MultiSubnetFailover=true connection string attribute.

I mentioned that you have to make some trade-offs when you set RegisterAllProvidersIP=0, and included this comparison:

But... what if you can eat your cake and have it, too?

In some cases, you’ll have some applications & clients that are not able to use MultiSubnetFailover=true, and other clients that can. Perhaps you’re working on updating a bunch of legacy Java apps to move from old jTDS drivers to the current Microsoft JDBC drivers that properly support MultiSubnetFailover=true. Parts of your codebase have been updated, and you want them to make use of the connection string attribute for fast cross-subnet failover. But other parts of your codebase are still being updated and rely on the RegisterAllProvidersIP cluster parameter to be false. Wouldn’t it be nice to have both?

Read on to learn how.
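
To show what the client side of this looks like, here is a small Python sketch that builds a connection URL for the Microsoft JDBC driver with and without the `multiSubnetFailover` property. It only covers the connection string attribute, not the cluster-side configuration Andy describes. The listener name, database name, and helper function are hypothetical.

```python
def jdbc_url(listener: str, database: str, multi_subnet_failover: bool, port: int = 1433) -> str:
    """Build a Microsoft JDBC driver URL, optionally enabling multiSubnetFailover."""
    parts = [f"jdbc:sqlserver://{listener}:{port}", f"databaseName={database}"]
    if multi_subnet_failover:
        # Updated clients opt in to fast cross-subnet failover.
        parts.append("multiSubnetFailover=true")
    return ";".join(parts)


if __name__ == "__main__":
    # An updated application that supports the attribute.
    print(jdbc_url("ag-listener.example.com", "Sales", True))
    # A legacy application that connects without it.
    print(jdbc_url("ag-listener.example.com", "Sales", False))
```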


Database Mail on Azure SQL Managed Instances

John McCormack shows how you can set up database mail from an Azure SQL Managed Instance:

It’s not too difficult to set up database mail for Azure SQL DB Managed Instance in comparison to SQL Server (on-prem or IaaS); however, there are a few extra things to consider. This post will describe how to set up database mail for Azure SQL DB Managed Instance. I will use SendGrid as the mail provider, but you can follow the same steps for any other mail provider or your company’s SMTP server.

Before I go on, my personal opinion is that including database mail is a massive feature for Managed Instances. The lack of DB Mail on Azure SQL DB Single Database or Amazon RDS is a major blocker to PaaS adoption. Now with Managed Instance, we can have PaaS and database mail.

Read on for the instructions. There’s a little bit more than what you typically would need to do on-premises, but just a little bit.


Xenographs

Alex Velez talks about xenographs:

I recall the first time I came across a horizon chart. Two thoughts came to mind: 1) this looks cool; and 2) I don’t have the energy to figure this out. Fast forward to now. I’ve learned how to read horizon charts, and I’ve even identified a few good use cases for them. This illustrates both the problem and the potential of xenographs. Let’s explore the potentially problematic side first.

Novel approaches to visualizing data can intimidate audiences. They introduce a learning curve because a never-before-seen graph typically requires time and energy to decipher. This obstacle could be enough to dissuade audiences from consuming the data altogether. Even if your audience does invest their time, the resulting conversation is often about reading the visual instead of the primary takeaway. This seems counterintuitive, especially in the explanatory analytics space, but it doesn’t mean we should denounce everything novel.

My response to this depends heavily on the medium. If you’re giving a presentation, a novel or underused chart can be good if it helps tell the story. You have the advantage of being there to explain the dynamics of the diagram for people who have never seen it before. For an informative article, you have some ability to elaborate, as in this bracket win probabilities diagram, which is exactly the type of thing you’d see in certain newspapers and magazines. But unless your visual is immediately intuitive (and I’d consider things like a Manhattan plot or maybe a Dot-boxplot to be intuitive enough for most audiences), I don’t think I would include many of those on public-facing or corporate dashboards, as they’re liable to confuse people and you might not have the space available to explain how this works.
