Press "Enter" to skip to content

Day: June 21, 2024

Practical healthyR.ts Examples

Steven Sanderson provides some examples:

Today I am going to go over some quick yet practical examples of ways that you can use the healthyR.ts package. This package is designed to help you analyze time series data in a more efficient and effective manner.

Let’s just jump right into it!

Read on for a few common time series activities, such as testing for stationarity, extracting trends from noise, and performing lagged correlation.
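
For a sense of what those three tasks look like in R, here is a minimal sketch using base R and the tseries package rather than the healthyR.ts wrappers themselves (see the post for those), with the built-in AirPassengers series standing in for real data:

```r
library(tseries)   # for adf.test()

series <- AirPassengers  # built-in monthly airline passenger counts

# 1. Test for stationarity with the Augmented Dickey-Fuller test.
#    A small p-value suggests the series is stationary.
adf.test(series)

# 2. Extract the trend from the noise via classical decomposition.
parts <- decompose(series)
plot(parts$trend)

# 3. Lagged correlation: the autocorrelation function across 24 monthly lags.
acf(series, lag.max = 24)
```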


Performing a Ping Sweep in PowerShell

Vlad Drumea goes poking around:

This is a brief post containing a piece of code that I use to do a ping sweep and resolve host names in PowerShell on a /24 subnet.

A /24 subnet refers to the last octet (segment of numbers) in an IP, and it ranges from 1 to 254.
This means that if you provide 100.100.100 to the $FirstThreeOctets variable, you’ll end up pinging every IP between 100.100.100.1 and 100.100.100.254.
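
As a rough approximation of the kind of script Vlad describes (his exact code is in the post; the $FirstThreeOctets variable name comes from his write-up, and the rest is illustrative), a /24 sweep with name resolution might look like this:

```powershell
# Sweep every host address in a /24 and resolve names for responders.
$FirstThreeOctets = '100.100.100'

1..254 | ForEach-Object {
    $IP = "$FirstThreeOctets.$_"
    if (Test-Connection -ComputerName $IP -Count 1 -Quiet -ErrorAction SilentlyContinue) {
        # Reverse DNS lookup; fall back gracefully when there's no PTR record.
        $HostName = '(no PTR record)'
        try { $HostName = ([System.Net.Dns]::GetHostEntry($IP)).HostName } catch { }
        [PSCustomObject]@{ IPAddress = $IP; HostName = $HostName }
    }
}
```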

This is where I say “Hey, go check out nmap.” I also say “Hey, don’t install nmap on your work machine unless you have explicit approval, so that you don’t get an unexpected visit from security.” That’s something I saw happen once, and I decided that wouldn’t be the life for me. But seriously, nmap is an extremely powerful network discovery tool.


Understanding Worktables in SQL Server

Steve Stedman takes a peek at tempdb:

A worktable in SQL Server is a temporary structure that the SQL Server Database Engine uses to process certain types of queries. These tables are not explicitly created by users but are generated by SQL Server internally to handle specific operations that cannot be managed directly within memory. Worktables are stored in the tempdb database and are crucial for facilitating complex query execution plans.

Read on for examples of when SQL Server will use worktables, as well as some good ideas for what to do when you spot worktables in the wild. They’re not inherently bad, but there are some performance problems you could experience around them.
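
If you want to spot one for yourself, here is a quick sketch: SET STATISTICS IO ON reports reads against a table literally named ‘Worktable’, and a recursive common table expression is one reliable way to make the engine spool one up in tempdb:

```sql
SET STATISTICS IO ON;

-- The recursion below forces a spool in tempdb, which surfaces as a worktable.
WITH Numbers AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 10000
)
SELECT COUNT(*) AS NumberCount
FROM Numbers
OPTION (MAXRECURSION 0);

-- The Messages tab then includes a line along the lines of:
-- Table 'Worktable'. Scan count 2, logical reads ...
```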


Don’t Enable TRUSTWORTHY on SQL Server

Jeff Iannucci shares good advice:

If you have ever used our free tool to check SQL Server security, you may have seen the check for “TRUSTWORTHY database owned by sysadmin” show up as one of the highest-priority items, requiring action. When we started reviewing the security permissions and configurations for our clients’ instances, we didn’t expect to find it very often, since the TRUSTWORTHY database setting is off by default.

Unfortunately, we have discovered this with some frequency, and when combined with a few other common practices, it presents a tremendous vulnerability: privilege escalation for both authorized users and hackers.

Read on to learn more about this. And to supplement, I will once again link to Solomon Rutzky’s outstanding guide on the topic.
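
If you’d like to check your own instances without the tool, a quick sketch against sys.databases (the IS_SRVROLEMEMBER() call flags owners in the sysadmin role, which is the dangerous combination; [YourDatabase] is a placeholder):

```sql
-- Find databases with TRUSTWORTHY enabled and see who owns them.
-- Note: msdb is TRUSTWORTHY by design, so leave that one alone.
SELECT d.name,
       SUSER_SNAME(d.owner_sid) AS owner_name,
       IS_SRVROLEMEMBER('sysadmin', SUSER_SNAME(d.owner_sid)) AS owner_is_sysadmin
FROM sys.databases AS d
WHERE d.is_trustworthy_on = 1;

-- Remediation for anything that shouldn't be on the list:
ALTER DATABASE [YourDatabase] SET TRUSTWORTHY OFF;
```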


Approximate Percentiles in SQL Server 2022

Chad Callihan tries out a big improvement:

How do you go about finding the median percentile of a data set? What if you need the top x percentile? Both the APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC functions can be used to answer these questions.

Let’s look at how we can use each and what makes them unique.

The approximate percentiles come with a documented accuracy guarantee: a 1.33% rank-based error bound at 99% confidence. That’s in the same neighborhood as the 2% guarantee on APPROX_COUNT_DISTINCT’s HyperLogLog algorithm, and nowhere near large enough to be of low value. If you’ve ever tried to calculate a median or another percentile, like the 75th or 95th, you might have used PERCENTILE_CONT() in the past. At least until you got a few million rows in the table, at which point you stopped using it. My joke is that, once you reach a certain table size, PERCENTILE_CONT() becomes so slow that it’s faster to install and configure SQL Server ML Services, learn R or Python, and send in the data to calculate a percentile than to wait for PERCENTILE_CONT() to complete.

The APPROX_PERCENTILE_* series is way, way faster. On reasonably sized test cases of a couple million rows or so, my recollection is two orders of magnitude better performance, so long as you can deal with being off by a few percentage points. One of the best scenarios for something like this is calculating 95th percentile response times. Does it really matter whether the actual response time was 187.5ms and SQL Server said 192.6ms or 181.4ms? Probably not; you get a good idea of the magnitude, and that’s the important part here.
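
As a sketch of the syntax difference (dbo.Responses and its ResponseMs column are hypothetical): APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC are true aggregates, so they work with GROUP BY, while PERCENTILE_CONT is a window function that needs an OVER clause plus a DISTINCT to collapse the duplicated result:

```sql
-- Approximate percentiles: one pass, aggregate semantics.
SELECT
    APPROX_PERCENTILE_CONT(0.5)  WITHIN GROUP (ORDER BY ResponseMs) AS MedianMs,
    APPROX_PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY ResponseMs) AS P95Ms,
    APPROX_PERCENTILE_DISC(0.95) WITHIN GROUP (ORDER BY ResponseMs) AS P95MsDiscrete
FROM dbo.Responses;

-- The exact version for comparison: fine on small tables, painful on large ones.
SELECT DISTINCT
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY ResponseMs) OVER () AS P95MsExact
FROM dbo.Responses;
```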
