Excel Cube Functions and Lambdas for Grouping

Published 2021-08-10 by Kevin Feasel

Chris Webb continues a series on lambda helper functions in Excel:

In the last post in this series I showed how you can use Excel’s new Lambda helper functions to return tables. In this post I’ll show you how you can use them to return a dynamic array of CubeSet functions which can be used to build a histogram and do the kind of ABC-type analysis that can be difficult to do in a regular Power BI report.

Read on to see a pair of examples along these lines.

Troubleshooting a Slow Restore

Published 2021-08-09 by Kevin Feasel

Sean Gallardy performs corporate dentistry:

This came with very little to no data available, and to be quite honest, saying “slow restore” doesn’t really mean much. The initial analysis needs to be an actual set of concrete data that describes the issue, what is normal, and what outliers, if any, exist. Since we have none, we can’t even start to analyze anything, so we need to clarify the problem statement and understand a little more about the issue.

This is an interesting dive into the problem and a good example of how to work with “We won’t let you see/do that” as a consultant. Incidentally, if you haven’t heard of WPR, that comes with the Windows Performance Toolkit.

Using Temporal Tables for Created and Updated Timestamps

Published 2021-08-09 by Kevin Feasel

Daniel Hutmacher has an interesting use case for temporal tables:

You have a table that you want to add “created” and “updated” timestamp columns to, but you can’t update the application code to update those columns. In the bad old times, you had to write a trigger to do the hard work for you. Triggers introduce additional complexity and potentially even a performance impact.
So here’s a nicer way to do it, trigger-free.

Click through for the solution, as well as a warning from Daniel.

Rounding Times in SQL Server

Published 2021-08-09 by Kevin Feasel

Steve Stedman has the low-down on time rounding in SQL Server:

One thing that I end up having to search for regularly is rounding of dates and times in Transact-SQL. Having looked this up too many times, I finally realized that it is time for me to do my own blog post for it.
First off, what’s the difference between rounding and truncating in these examples? Rounding rounds to the closest minute, so 10:00:31 is rounded up to 10:01:00, and 10:00:29 is rounded down to 10:00:00. With truncation, it simply changes the truncated area to 0s, so 10:00:31 gets truncated down to 10:00:00, and so does 10:00:59. Sometimes you may want rounding, and sometimes you may want truncation (floor) for your specific needs.

After having used date_trunc() in Postgres, I’d really like something similar in SQL Server.
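
To make the rounding-versus-truncation distinction concrete, here is a minimal sketch in Python rather than T-SQL. The function names are my own and are not from Steve's post; the idea is only to show that truncation zeroes out the seconds, while rounding adds half a minute first and then truncates.

```python
from datetime import datetime, timedelta


def truncate_to_minute(ts: datetime) -> datetime:
    """Floor to the minute by zeroing seconds and microseconds."""
    return ts.replace(second=0, microsecond=0)


def round_to_minute(ts: datetime) -> datetime:
    """Round to the nearest minute: add 30 seconds, then truncate."""
    return truncate_to_minute(ts + timedelta(seconds=30))


if __name__ == "__main__":
    # Sample times taken from the quoted examples; the date is arbitrary.
    for text in ("10:00:29", "10:00:31", "10:00:59"):
        ts = datetime.strptime(f"2021-08-09 {text}", "%Y-%m-%d %H:%M:%S")
        print(
            text,
            "rounded:", round_to_minute(ts).time(),
            "truncated:", truncate_to_minute(ts).time(),
        )
```

With this approach, a value at exactly 30 seconds rounds up. Whether that is the right tie-breaking rule depends on your needs.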

Generating Mock Data for SQL Server

Published 2021-08-09 by Kevin Feasel

Chad Callihan has a few options for creating fake data:

It’s easy enough to create a handful of records for testing in SQL Server. What if you want 100 rows or 1000 rows? What if you want data that looks more legitimate compared to gibberish? In this post, we’ll look at different ways to generate mock data.

One of the trickiest things about creating mock data is getting the distributions right. For example, ABS(CHECKSUM(NEWID())) is great (as is RAND(CHECKSUM(NEWID()))), but the results follow a uniform distribution because of the nature of checksums and random number generators. This makes charting numeric values look unnatural. Here’s an example I put together of generating data off of a normal distribution. It does take more effort, but if you’re generating this fake data to show it to users in tools like Power BI or Tableau, having data follow reasonable distributions is a good thing. That is, use whatever distribution makes sense for the particular data element: uniform, normal, Pareto (power law), gamma, etc.
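
As a rough illustration of why the distribution matters, here is a small Python sketch (not the T-SQL from either post) that generates a hypothetical "order amount" column two ways. The column name, the mean of 100, and the standard deviation of 15 are all assumptions picked for the example.

```python
import random
import statistics


def uniform_amounts(rng: random.Random, n: int, low: float = 0.0, high: float = 200.0) -> list:
    """Every value in [low, high] is equally likely."""
    return [rng.uniform(low, high) for _ in range(n)]


def normal_amounts(rng: random.Random, n: int, mean: float = 100.0, stdev: float = 15.0) -> list:
    """Values cluster around the mean and thin out toward the tails."""
    return [rng.gauss(mean, stdev) for _ in range(n)]


def share_near_center(values: list, center: float = 100.0, width: float = 15.0) -> float:
    """Fraction of values within `width` of `center`."""
    return sum(1 for v in values if abs(v - center) <= width) / len(values)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs produce the same data
    for label, values in (
        ("uniform", uniform_amounts(rng, 5000)),
        ("normal", normal_amounts(rng, 5000)),
    ):
        print(
            label,
            "mean:", round(statistics.mean(values), 1),
            "share within 15 of 100:", round(share_near_center(values), 2),
        )
```

Both columns have roughly the same average, but a histogram of the uniform one is flat while the normal one has the familiar bell shape, which tends to look more plausible for things like prices or durations.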

Starting SQL: a Video Series

Published 2021-08-09 by Kevin Feasel

Erik Darling wraps up a slew of videos:

Over the last month, I’ve given away all my beginner SQL Server training content. I hope you’ve enjoyed it, and maybe even learned a thing or two.
After this, I’ll be getting back to my regular blogging.

There are a lot of videos to check out, and right now, Erik has a big discount off of his advanced training, so go, go, go.

Monitoring SQL Server on Linux with Telegraf, InfluxDB, and Grafana

Published 2021-08-09 by Kevin Feasel

Amit Khandelwal extends a solution for SQL Server on Windows:

In this blog, we will look at how we configure near real-time monitoring of SQL Server on Linux and containers with the Telegraf-InfluxDB and Grafana stack. This is built on similar lines to Azure SQLDB and Managed Instance solutions already published by my colleague Denzil Ribeiro. You can refer to the above blogs to know more about Telegraf, InfluxDB and Grafana.

Click through for the quick version, and then the step-by-step process.

Updating SQL Server Container Memory Limits

Published 2021-08-09 by Kevin Feasel

Andrew Pruski doesn’t have time to restart containers:

When running multiple SQL Server containers on a Docker host we should always be setting CPU and Memory limits for each container (see the flags for memory and cpus here). This helps prevent the whole noisy neighbour situation, where one container takes all the host’s resources and starves the other containers.
But what if we forget to set those limits? Well, no worries…we can update them on the fly!

Click through to see how you can change the memory limits on a running container.
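
For a sense of what changing limits on the fly can look like, here is a minimal Python sketch that only builds a `docker update` command line and prints it. The container name `sqlcontainer1` and the 4g values are hypothetical, and this is not necessarily the exact command Andrew uses; `--memory` and `--memory-swap` are standard `docker update` flags.

```python
import shlex
from typing import List, Optional


def docker_update_memory_cmd(
    container: str, memory: str, memory_swap: Optional[str] = None
) -> List[str]:
    """Build (but do not run) a `docker update` call that changes memory limits."""
    args = ["docker", "update", "--memory", memory]
    if memory_swap is not None:
        # Often set alongside --memory, since the swap limit cannot be lower than it.
        args += ["--memory-swap", memory_swap]
    args.append(container)
    return args


if __name__ == "__main__":
    # Hypothetical container name and sizes, for illustration only.
    cmd = docker_update_memory_cmd("sqlcontainer1", "4g", "4g")
    print(shlex.join(cmd))
    # To actually apply it, pass `cmd` to subprocess.run(cmd, check=True).
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` without shell quoting problems.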

How Spark Determines Task Numbers and Parallelism

Published 2021-08-06 by Kevin Feasel

The Hadoop in Real World team explains how the Spark engine decides how many tasks to create for a job and how many can run in parallel:

In this post we will see how Spark decides the number of tasks and number of tasks to execute in parallel in a job.
Let’s see how Spark decides on the number of tasks with the below set of instructions.
[… instructions]
Let’s also assume dataset_X has 10 partitions and dataset_Y has 5 partitions.

Click through for the full explanation.
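
As a back-of-the-envelope companion, here is a small Python sketch of the usual arithmetic: a stage runs one task per partition, and the number of tasks that can run at once is bounded by the total executor cores divided by the CPUs each task needs. The cluster size below (2 executors with 2 cores each) is an assumption for illustration and does not come from the linked post.

```python
import math


def parallel_slots(num_executors: int, cores_per_executor: int, cpus_per_task: int = 1) -> int:
    """Maximum number of tasks that can run at the same time across the cluster."""
    return num_executors * (cores_per_executor // cpus_per_task)


def waves(num_tasks: int, slots: int) -> int:
    """How many rounds of scheduling it takes to run every task in a stage."""
    return math.ceil(num_tasks / slots)


if __name__ == "__main__":
    slots = parallel_slots(num_executors=2, cores_per_executor=2)  # hypothetical cluster
    # Partition counts match the 10 and 5 used in the excerpt.
    for partitions in (10, 5):
        print(
            f"{partitions} partitions -> {partitions} tasks,",
            f"{slots} at a time, {waves(partitions, slots)} wave(s)",
        )
```

This ignores shuffles, which create new stages with their own partition counts, so treat it as a first approximation only.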

Helpful Tools for Apache Kafka Developers

Published 2021-08-06 by Kevin Feasel

Dave Klein has a few tools to make working with Apache Kafka a little easier:

We like to save the best for last, but this tool is too good to wait. So, we’ll start off by covering kafkacat.
kafkacat is a fast and flexible command line Kafka producer, consumer, and more. Magnus Edenhill, the author of the librdkafka C/C++ library for Kafka, developed it. kafkacat is great for quickly producing and consuming data to and from a topic. In fact, the same command will do both, depending on the context. Check this out:

Read on for more information on this tool, as well as several others.

Author: Kevin Feasel