Kevin Feasel

Changing the Slow Query Log Threshold in RDS

Published 2021-08-20 by Kevin Feasel

John McCormack wants to know about those slow queries:

The slow query log will record all queries which are above the threshold level. The default value is 10 (seconds) but you can set it higher or lower depending on your requirements. It is useful for finding slow queries and allows you to pick out candidates for tuning.
If you set the threshold too low, it can increase I/O overhead on your instance and use a lot of valuable disk space. If you set it too high, it might not capture enough useful information.

This is a setting in Amazon Relational Database Service (RDS) and mirrors the slow query log functionality in MySQL.
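As a rough illustration of what changing the threshold involves, here is a minimal Python sketch that builds the parameter list for a MySQL-family DB parameter group. It assumes the MySQL system variables `slow_query_log` and `long_query_time`. The helper name `slow_log_parameters` and the parameter group name in the usage comment are hypothetical, and this is not necessarily the method John describes in the linked post.

```python
def slow_log_parameters(threshold_seconds):
    """Build a Parameters payload that enables the slow query log
    and sets its threshold (in seconds) for a MySQL-family engine."""
    if threshold_seconds < 0:
        raise ValueError("threshold_seconds must be non-negative")
    return [
        {
            # Turns the slow query log on.
            "ParameterName": "slow_query_log",
            "ParameterValue": "1",
            "ApplyMethod": "immediate",
        },
        {
            # Queries running longer than this many seconds are logged.
            "ParameterName": "long_query_time",
            "ParameterValue": str(threshold_seconds),
            "ApplyMethod": "immediate",
        },
    ]


# Hypothetical usage with boto3 (requires AWS credentials and an existing
# custom parameter group; "my-mysql-params" is a made-up name):
#
#   import boto3
#   rds = boto3.client("rds")
#   rds.modify_db_parameter_group(
#       DBParameterGroupName="my-mysql-params",
#       Parameters=slow_log_parameters(2),
#   )
```

Keeping the payload builder separate from the AWS call makes it easy to review the threshold before anything is applied to an instance.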

Comments closed

Projecting Disk Space Available

Published 2021-08-20 by Kevin Feasel

Constantine Kokkinos predicts the future:

The first question I wanted to model out was a bigger issue with on-premises databases – when are we going to run out of storage?
Back in the day I’d cheat with msdb backups, comparing compressed sizes to actuals, and moving on. However I don’t have a historical reference for Stack Overflow… so what can I do?
Taking a look at the tables we see a commonality in many tables – CreationDate! It looks like the rows faithfully are stamped when they are created.

Constantine does at the end hit on something we tend to forget: most operations in life aren’t quite linear. We often get lucky in that certain stretches are close enough to be linear that we can model them that way, but even in this dataset, you can see the effects of polynomial growth slowly build up. Still, this is a good way of taking us through what an analysis and projection can look like.
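To make the linear-projection idea concrete, here is a minimal Python sketch under stated assumptions: rows have already been counted per month (for example, by grouping on CreationDate), every row takes roughly the same number of bytes, and growth is modeled with an ordinary least-squares line. The function names and the `bytes_per_row` figure are hypothetical and are not taken from Constantine's post.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_full(monthly_new_rows, bytes_per_row, capacity_bytes):
    """Estimate how many months remain after the last observed month
    before cumulative data size reaches capacity_bytes.

    Returns None if the fitted trend is flat or shrinking."""
    # Turn per-month row counts into a cumulative size series.
    cumulative = []
    total = 0
    for rows in monthly_new_rows:
        total += rows * bytes_per_row
        cumulative.append(total)

    months = list(range(1, len(cumulative) + 1))
    slope, intercept = fit_line(months, cumulative)
    if slope <= 0:
        return None

    # Month index at which the fitted line crosses the capacity.
    full_at = (capacity_bytes - intercept) / slope
    return max(0.0, full_at - months[-1])


# Example: a steady 100 rows per month at 10 bytes each, 10,000 bytes of space.
remaining = months_until_full([100, 100, 100, 100], 10, 10_000)
```

Because the fit is a straight line, it will overestimate the time remaining whenever growth is accelerating, which is exactly the caveat about non-linear growth.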

When Did That SQL Server Start?

Published 2021-08-20 by Kevin Feasel

Chad Callihan breaks out the stopwatch:

When you are troubleshooting query performance, it’s important to consider when SQL Server last restarted. Each time SQL Server restarts, buffer pool and plan cache get wiped out. This can play a big part in how you approach an issue.
You want to know how much history you have to work with. Not seeing much data in plan cache? Maybe SQL Server restarted recently. Parameter sniffing issues that you were seeing earlier today are now suddenly “fixed?” SQL Server could have restarted between then and now. Let’s take a quick look at a few ways to review how long SQL Server has been up and running.

Click through for several techniques. I’ve always checked the tempdb creation date (create_date) in sys.databases, myself, since tempdb is rebuilt on every startup, but that’s not the only way.
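For reference, here is a small Python sketch that holds two queries which return the startup time and turns a start timestamp into a readable uptime. The tempdb query is the check mentioned above; `sys.dm_os_sys_info` is another place SQL Server exposes the start time, though I am not claiming it is one of the techniques in Chad's post. The helper names are hypothetical, and running the queries needs a database driver, which is left out here.

```python
from datetime import datetime

# T-SQL that returns the instance start time; run with any SQL Server client.
START_TIME_QUERIES = {
    "tempdb create_date": (
        "SELECT create_date FROM sys.databases WHERE name = N'tempdb';"
    ),
    "sys.dm_os_sys_info": (
        "SELECT sqlserver_start_time FROM sys.dm_os_sys_info;"
    ),
}


def describe_uptime(start_time, now):
    """Format the time elapsed since start_time as days, hours, minutes."""
    if now < start_time:
        raise ValueError("start_time is in the future")
    delta = now - start_time
    hours, remainder = divmod(delta.seconds, 3600)
    minutes = remainder // 60
    return f"{delta.days}d {hours}h {minutes}m"


# Example with made-up timestamps standing in for a query result.
summary = describe_uptime(
    datetime(2021, 8, 19, 6, 30),
    datetime(2021, 8, 20, 9, 0),
)
```

A short uptime is a hint that the plan cache and buffer pool have not had time to fill, so any history you see there is limited.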

Making a Column Chart Better

Published 2021-08-20 by Kevin Feasel

Meagan Longoria improves a visual:

There are some easy opportunities to improve the readability of this chart, so I thought I would use it as an example of how small improvements can have a big impact on a fairly simple chart. I recreated the chart (as best I could) in Power BI and then made two revised versions.

Read on to see what Meagan did and get some advice on the subject.

A Learning Path for Data Science with R

Published 2021-08-19 by Kevin Feasel

Holger von Jouanne-Diedrich has a greatest hits album:

Over the course of the last two and a half years, I have written over one hundred posts for my blog “Learning Machines” on the topics of data science, i.e. statistics, artificial intelligence, machine learning, and deep learning.
I use many of those in my university classes and in this post, I will give you the first part of a learning path for the knowledge that has accumulated on this blog over the years to become a well-rounded data scientist, so read on!

Read on for links to dozens of posts on interesting topics.

Thoughts on Topic Replication in Kafka

Published 2021-08-19 by Kevin Feasel

Jeffrey Carter elaborates on a pair of concepts related to topic replication in Apache Kafka:

Apache Kafka has the default ability to allow a topic to be created on a broker when a message is written to it and when a topic with the name the message is attempting to be written to does not exist. This can be very helpful in early development or prototyping where code, topic names, and schemas are in flux. However, past that early stage, it is recommended that Kafka be configured to disable the auto-creation of topics from messages for a few reasons. In this article I am going to touch on two of these reasons that are also core principles of Kafka partitions and Kafka topic replication.

Read on to understand what partitions and replication factor have to do with all of this.
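As a hedged illustration of why this matters, the Python sketch below models what an auto-created topic ends up with when the broker falls back to its defaults. It assumes the stock broker settings `num.partitions=1` and `default.replication.factor=1`. The `TopicSpec` class and both helper functions are hypothetical and do not call Kafka. Jeffrey's own reasoning is in the linked post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicSpec:
    name: str
    partitions: int
    replication_factor: int


# Assumed stock broker defaults that apply to auto-created topics.
BROKER_DEFAULTS = {"num.partitions": 1, "default.replication.factor": 1}


def auto_created(name, broker_config=BROKER_DEFAULTS):
    """The spec a topic gets when the broker creates it on first write."""
    return TopicSpec(
        name,
        broker_config["num.partitions"],
        broker_config["default.replication.factor"],
    )


def validate(spec, broker_count):
    """Return a list of problems with a topic spec for a given cluster size."""
    problems = []
    if spec.partitions < 1:
        problems.append("a topic needs at least one partition")
    if spec.replication_factor > broker_count:
        problems.append("replication factor exceeds the number of brokers")
    if spec.replication_factor < 2:
        problems.append("no replica to fail over to if a broker is lost")
    return problems


# An auto-created topic on a three-broker cluster versus a planned one.
accidental = auto_created("orders")
planned = TopicSpec("orders", partitions=6, replication_factor=3)
accidental_problems = validate(accidental, broker_count=3)
planned_problems = validate(planned, broker_count=3)
```

Setting `auto.create.topics.enable=false` on the brokers forces topics to be created deliberately, with partition and replication choices made up front.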

Ways to Filter Data in PostgreSQL

Published 2021-08-19 by Kevin Feasel

Gauri Mahajan shows off several techniques for filtering data in PostgreSQL:

Data is hosted in a variety of data repositories, one of which is relational databases. Out of tens of commercial and open-source relational databases, one of the most popular open-source relational databases is PostgreSQL. This database is offered on the Azure cloud platform through a service named Azure Database for PostgreSQL. One of the most fundamental operations performed on the database is reading and writing data to consume and host data. It goes without saying that when the data is consumed, it must be scoped based on the requirements or criteria specified by the consumer. This translates to filtering the data while querying it. Like every other relational database, Postgres offers different operators and options to filter data while querying. Let’s go ahead and learn some of the most fundamental ways to filter data hosted in PostgreSQL.

Most of them are the same as what you have in T-SQL, but not everything.

Elastic Data Maps with Azure Purview

Published 2021-08-19 by Kevin Feasel

Wolfgang Strasser has some good news for us:

It’s been a long and intensive discussion – the (initial) pricing structure of Azure Purview. As I already discussed in my Purview pricing blog post, the basic cost calculation involves
– the cost for the data map (= the infrastructure to store metadata and provide the Purview UI + cataloging functionality)
– plus the costs involved for scanning sources.

And that has added up to a significant amount of money, especially in dev/test scenarios. But read on for the glad tidings Wolfgang has to share.

Power BI Push Tools

Published 2021-08-19 by Kevin Feasel

Marco Russo announces a new set of tools:

Power BI Push Tools is a set of open-source tools to work with Power BI push datasets.
Read “Implementing real-time updates in Power BI using push datasets instead of DirectQuery” to learn the architecture of solutions based on push datasets and how to create and populate a push dataset starting from a regular Tabular model.

In addition to reading that linked post, click through to see the three tools which are currently available.

Eliminating False Positives when Searching for NOLOCK

Published 2021-08-19 by Kevin Feasel

Aaron Bertrand carries a lantern around Athens, searching for an honest NOLOCK:

In Part 1 of this series, I showed how to identify “NOLOCK in DML” patterns within a single statement using a Visitor pattern. In Part 2, I showed how to find those patterns in procedure and trigger bodies across multiple databases and instances. I still need to show how to eliminate false positives without tedious visual inspection.

Read on to see how we can narrow down the likelihood of a false positive when searching for the worst sorts of NOLOCK queries.

Author: Kevin Feasel