Press "Enter" to skip to content

Month: April 2021

Displaying Metrics on Graphite Dashboards

Nick Campion takes us through working with Graphite:

Graphite is a free and open-source software. It is used as a time-series database monitoring tool, where you can collect, store and display time-series data in real-time. As you can monitor certain metrics of this data using Graphite, it has a very useful and simple dashboard used to visualize these metrics.

This article will show you how to display a metric on your Graphite dashboard.

Click through for more information.

Comments closed

Kafka Sans ZooKeeper

Ben Stopford and Ismael Juma give us a preview:

So we’re very pleased to say that the early access of the KIP-500 code has been committed to trunk and is expected to be included in the upcoming 2.8 release. For the first time, you can run Kafka without ZooKeeper. We call this the Kafka Raft Metadata mode, typically shortened to KRaft (pronounced like craft) mode.

Beware, there are some features that are not available in this early-access release. We do not yet support the use of ACLs and other security features or transactions. Also, both partition reassignment and JBOD are unsupported in KRaft mode (these are anticipated to be available in an Apache Kafka release later in the year). Hence, consider the quorum controller experimental software—we don’t advise subjecting it to production workloads. If you do try out the software, however, you’ll find a host of new advantages: It’s simpler to deploy and operate, you can run Kafka in its entirety as a single process, and it can accommodate significantly more partitions per cluster (see measurements below).

Read on for more information. This is a big deal for Kafka.

Comments closed

Describing the Physical Join Operators

Deepthi Goguri explains the three physical join operators in SQL Server:

Merge join is not a bad thing and it may be efficient already in your execution plan. What you have to observe when you see the merge joins and performance slow on that plan is to focus on the upstream operations that are going into the merge join. Whether the data is presorted as you already have an index or whether the data is presorted in SQL Server own way then in that case, you can simply check if you can just add that missing column in the index and place in the last key column in the index or use a different join algorithm will be better. The other scenario might be you have lots of duplicate values in your data. If that is the case SQL Server will be using the work tables to handle how the duplicate values can be joined on. So, if you see the duplicate values or using tempdb, then finding the better options will be good.

Click through for more detail. Each physical operator has its place and does quite well within it, but the challenge comes when the optimizer thinks a particular route is better than the one you had in mind.

Comments closed

Comparing CSV to Parquet File Loading Performance in Power BI

Chris Webb has a comparison for us:

Earlier in this series on importing data from ADLSgen2 into Power BI I showed how partitioning a table in your dataset can improve refresh performance. In that post I used CSV files in ADLSgen2 as my source and created one partition per CSV file, but after my recent discovery that importing data from multiple Parquet files can be tuned to be a lot faster than importing data from CSV files, I decided to try creating partitions linked to Parquet files instead.

Click through for the experiment and its results.

Comments closed

Query Plans and Window Functions

Erik Darling has a two-fer here. First, window functions and parallelism:

When windowing functions don’t have a Partition By, the parallel zone ends much earlier on than it does with one.

That doesn’t mean it’s always slower, though. My general experience is the opposite, unless you have a good supporting index.

But “good supporting index” is for tomorrow. You’re just going to have to deal with that.

Second, columnstore behavior with respect to window functions:

Not only is the parallel version of the row mode plan a full second slower, but… look at that batch mode plan.

Look at it real close. There’s a sort before the Window Aggregate, despite reading from the same nonclustered index that the row mode plan uses.

But the row mode plan doesn’t have a Sort in it. Why?

Check out both posts.

Comments closed

Automated File Cleanup with Powershell

Jeffrey Hicks does some spring cleaning:

Springtime is approaching in North America. Where I live, the snow has finally melted and we have blue skies with warmer temperatures. Of course, this means Spring Cleaning. Time to clear out the winter debris and spruce up the house. For me, this is also a good time for some computing housecleaning as well. I don’t know about your Windows environment, but I tend to accumulate a lot of junk. Most of the time I don’t see it, but I know it’s there. While the junk normally doesn’t have a negative impact, I think mentally, I like clearing things out and tidying up. So I pulled out some older PowerShell code, freshened it up, and now I have a set of tools for clearing out junk and temporary folders. Let me show you what I came up with.

Click through for the scripts.

Comments closed

Auto-Failover Groups and Grace Periods

Taiob Ali clears up some misunderstanding:

The auto-failover groups feature for the Azure SQL database can be configured with an automatic failover policy. Azure triggers failover after the failure is detected and the grace period has expired. Grace period is determined by a setting called ‘GracePeriodWithDataLossHours’ that cannot be set under one hour. Why is it not allowed to set a time which is less than an hour? Can your business tolerate the application be down for that period? Should your turn off Auto Fail-over and set it to manual?

I noticed a lot of confusion around this setting, including my own. Some of the confusion is due to a lack of clarity in the documentation. I checked with the Microsoft Azure SQL team, and they are actively working on clarifying some of the questions I raised.

I want to thank Dimitri Furman and Roberto Bustos from the Azure SQL Team for clarifying some of my confusion that I will share here.

Read on for a Q&A style explanation of auto-failover and grace periods.

Comments closed

What to Do if a Database Isn’t Synchronizing

Lee Markup has some advice:

I currently manage 5 AlwaysOn Availability Groups. Two are on SQL Server 2014 and overdue for an upgrade, while three of them are on SQL Server 2017.  From time to time I have run into a couple of different situations that I needed to troubleshoot and I want to tell you where to look and what to check on in these scenarios. I can’t possibly tell you about everything that could go wrong, but I can tell you about my experience with AlwaysOn Availability Groups and let you decide if that experience helps you or not.

Click through for more.

Comments closed

Fixed Server and Database Roles in SQL Server

Greg Larsen takes us through the built-in set of server and database roles in SQL Server:

Managing Security for SQL Server is extremely important. As a DBA or security administrator, you need to provide access for logins and database users to resources within SQL Server. SQL Server has both server and database resources where access might need to be granted. Access to these resources can be granted to either individual logins or database users or can be granted to roles, for which logins or database users can be members. Granting access via a role is known as role-based security.

There are two types of roles: fixed or user-defined. In this article, I will discuss the different fixed server and database roles provided with SQL Server and how these roles can be used to support role-based security to simplify providing access to the different SQL Server resources. In a future article, I will discuss user-defined server and database roles.

Click through for more information. Or just give everybody sysadmin. I mean, that’s probably fine, right?

Comments closed

Power Query and the First Day of the Week

Ed Hansberry decrees that henceforth, Thursday shall be the first day of the week:

By default, Power BI starts its week on Sunday, but that doesn’t work for every situation. Some companies prefer their week start on Monday, and I recently helped someone that needed their week to start on Thursday. Fortunately, by tweaking your date table in Power Query, you can make this change with no calculations at all.

Click through to see how.

Comments closed