Kevin Feasel – Page 687

The Practical Costs of Index Fragmentation

Published 2022-03-29 by Kevin Feasel

Tibor Karaszi digs into index performance:

See numbers and diagrams at the end, or at the top. I measured a few cases: the difference between no external fragmentation and severe external fragmentation (over 99%). I have both a narrow index and a wide index, and I read one (1), 10,000 and 100,000 rows using index searches (“range scan”). There was obviously no difference reading 1 row, so I exclude that from my discussion below. For the other cases the extra time with an extreme level of external fragmentation is (from lowest impact to highest) 7%, 10%, 13% and 32%. The highest number (32%) is when reading many rows from a narrow index, i.e. many rows per page. Again, this is with an extreme level of fragmentation.

What’s interesting is that for the most part, there’s a negligible difference between ~0% external fragmentation and ~99% external fragmentation. The follow-on question is, how much are defrag operations costing you in performance and when is the benefit worth the cost?

Finding Free Space in a SQL Server Filegroup

Published 2022-03-29 by Kevin Feasel

John McCormack does some digging:

I just realised that in all my scripts that I use on a regular basis, I didn’t have one for working out free space in SQL Server filegroups. It’s not something that comes up too often but it’s handy to know. For methods of working out space in individual files, you could refer to this post on mssqltips.

Click through for the query and congrats to John on 100 posts.

Determining if SQL Server Needs More Memory

Published 2022-03-29 by Kevin Feasel

Erik Darling breaks Betteridge’s Law of Headlines:

In this post, we’ll talk about how to figure out if your SQL Server needs more memory, and if there’s anything you can do to make better use of memory at the same time.
After all, you could be doing just fine.
(You’re probably not.)

I have a simple flow chart: do you have all of the memory created since the 1990s? If not, then you need more memory. If so, may I please borrow a cup of RAM?

The Performance Cost of CAST/CONVERT in a WHERE Clause

Published 2022-03-29 by Kevin Feasel

Monica Rathbun does the math:

Remove CONVERT/CAST from your WHERE clauses and JOINS when comparing to variables of different data types. Set their data types to match your table definitions before using them as a filter. Optimizing your queries this way will greatly reduce the amount of CPU time, reads, and I/O generated in your queries and allow your code to take better advantage of indexes.

This can quietly be a major performance issue.
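To make the idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite stands in for SQL Server purely for illustration, and the `orders` table, `ix_orders_code` index, and `query_plan` helper are all hypothetical names. SQL Server's optimizer differs in many details, so treat this as a demonstration of the general principle (wrapping an indexed column in a conversion tends to prevent an index seek) rather than a reproduction of Monica's tests.

```python
import sqlite3


def query_plan(conn, sql, params):
    """Return the 'detail' text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Hypothetical table: a text column with an index on it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_code TEXT, amount REAL)")
conn.execute("CREATE INDEX ix_orders_code ON orders (order_code)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(str(i), i * 1.5) for i in range(10_000)],
)

# Parameter typed to match the column: the index can be searched directly.
matching_plan = query_plan(
    conn,
    "SELECT amount FROM orders WHERE order_code = ?",
    ("4242",),
)

# Column wrapped in CAST to match an integer parameter: each row has to be
# converted before it can be compared, so the index cannot be searched.
cast_plan = query_plan(
    conn,
    "SELECT amount FROM orders WHERE CAST(order_code AS INTEGER) = ?",
    (4242,),
)

print("matching types:", matching_plan)
print("CAST on column:", cast_plan)
```

In this sketch the first plan reports a search on the index and the second reports a scan of the table. The fix follows the advice in the quote: change the type of the variable to match the column instead of converting the column.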

Replacing ZooKeeper in Kafka

Published 2022-03-28 by Kevin Feasel

Guozhang Wang explains the decision-making behind a major change in Apache Kafka:

Why replace ZooKeeper with an internal log for Apache Kafka® metadata management? This post explores the rationale behind the replacement, examines why a quorum-based consensus protocol like Raft was utilized and altered to become KRaft, and describes the new Quorum Controller built on top of KRaft protocols.

Click through for the reasoning, which includes a considerably faster shutdown in large environments.

Azure ML Well-Architected Framework Review

Published 2022-03-28 by Kevin Feasel

Ben Brauer has good news:

Microsoft offers prescriptive guidance called the Well-Architected Framework that optimizes workloads implemented and deployed on Azure. This guidance has been generalized for most workloads and creates a basis for reliable and secure applications that are cost optimized.
We have begun to build on this base content set to include more precise guidance for specific workload types, such as machine learning, data services and analytics, IoT, SAP, mission critical apps, and web apps. Machine Learning was the first branch from the base content, which came to fruition in the Fall of 2021.

In case you have never used the Azure Well-Architected Review assessment tool, it’s really useful. It can take hours (or days) to go through the review but if you take it seriously and have the right people in the room giving answers, you’ll get concrete guidance on how to optimize your Azure-based solutions.

Data Visualization in Python

Published 2022-03-28 by Kevin Feasel

Mehreen Saeed uses a few data visualization libraries in Python:

Data visualization is an important aspect of all AI and machine learning applications. You can gain key insights into your data through different graphical representations. In this tutorial, we’ll talk about a few options for data visualization in Python. We’ll use the MNIST dataset and the TensorFlow library for number crunching and data manipulation. To illustrate various methods for creating different types of graphs, we’ll use Python’s graphing libraries, namely matplotlib, Seaborn and Bokeh.

Bokeh results can look really nice, although it does feel like it requires a lot more developer time and effort to get it right. Click through for examples of each of the three libraries.

Creating an Azure Redis Cache

Published 2022-03-28 by Kevin Feasel

Arun Sirpal continues a series on Azure Redis:

Remember – basic should never be used for production. Also, if you need dedicated service then you will not want C0 because this is based on shared infrastructure. Redis can get expensive but could be cost-effective especially if you design to use a multi app approach per cache.
I select P1 – Premium with 6GB cache just to talk a couple things through.

As a note, 6GB of cache is a lot in most environments. That’s because your average cached element size in Redis should be measured in single-digit or double-digit bytes, not kilobytes. You’re typically caching individual values, not entire documents, so if you average 64 bytes per cached key-value combo, you can get somewhere around 90 million values in cache at a time. The database call savings add up quickly, considering a really simplistic estimation: if the average number of queries before expiration for a cached item is 3, a single “cycle” of caching saves you about 270 million database calls. That can allow you to downscale your relational databases considerably, saving a lot of money in the process. There’s a lot of hand-waving I’m doing in the math and a lot of complexity I’m wiping away, but both of those tend on average to make the cache more effective, not less.
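The back-of-the-envelope math above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes decimal gigabytes and ignores any per-key overhead Redis adds, so it is an estimate in the same hand-waving spirit and not a sizing tool.

```python
def estimate_cached_items(cache_bytes: int, avg_entry_bytes: int) -> int:
    """Rough count of entries that fit, ignoring any per-key overhead."""
    return cache_bytes // avg_entry_bytes


def estimate_calls_saved(cached_items: int, reads_per_item: int) -> int:
    """Database calls avoided in one cache 'cycle', using the simplistic
    assumption that every read of a cached item replaces one database call."""
    return cached_items * reads_per_item


cache_bytes = 6 * 10**9   # 6 GB cache, assuming decimal gigabytes
avg_entry_bytes = 64      # average size of a cached key-value combo

raw_items = estimate_cached_items(cache_bytes, avg_entry_bytes)
rounded_items = 90_000_000  # "somewhere around 90 million"
calls_saved = estimate_calls_saved(rounded_items, reads_per_item=3)

print(f"entries that fit (raw estimate): {raw_items:,}")
print(f"database calls saved per cycle:  {calls_saved:,}")
```

The raw division gives 93,750,000 entries, which rounds down to the "around 90 million" figure, and three reads per entry yields the 270 million calls saved per cycle.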

Using the Power BI Scanner

Published 2022-03-28 by Kevin Feasel

Gilbert Quevauvilliers sets scanners to On:

As mentioned in my previous blog post this is part 1 of the series where I am going to show you how to use the Power BI Scanner to get the App workspace data. I am also going to mention that the Power BI Scanner from PowerBI.Tips and Tommy Puglia (Twitter) has a wealth of other awesome information for your Power BI tenant.
Fortunately I do not have to go through all the steps on setting up and getting the Power BI Scanner data, you can do it by following the blog post already created with some amazing details here: Using the Power BI Scanner API to Manage Tenant’s Entire Metadata

Check out that article, but Gilbert also has some nice tips of his own.

Time Zones and Extended Events

Published 2022-03-28 by Kevin Feasel

Tomas Zika answers a question:

I’ve helped answer another question that appeared on the SQL Server Slack:
Are timestamps in XE event files you view in SSMS local or server time?
To test this, I need a server in a different timezone than the client (SSMS). I find the quickest and easiest tool for that to be containers – more specifically, Docker.

Click through for the answer, as well as a few Docker-related incidentals.


Author: Kevin Feasel