Press "Enter" to skip to content

Month: June 2024

Tweedie Distributions and Generalized Linear Modeling

Christian Lorentzen talks about Tweedie distributions:

Tweedie distributions and Generalised Linear Models (GLM) have an intertwined relationship. While GLMs are, in my view, one of the best reference models for estimating expectations, Tweedie distributions lie at the heart of expectation estimation. In fact, basically all applied GLMs in practice use Tweedie distributions with three notable exceptions: the binomial, the multinomial and the negative binomial distribution.

Read on for a bit more about its history and how it ties in with several other distributions.

Comments closed

Emulating the FILTER Clause in Oracle

Lukas Eder notes a bit of ANSI SQL:

The following aggregate function computes the number of rows per group which satifsy the FILTER clause:

SELECT  COUNT(*) FILTER (WHEREBOOK.TITLE LIKE'A%'),  COUNT(*) FILTER (WHEREBOOK.TITLE LIKE'B%'),  ...FROMBOOK

This is useful for pivot style queries, where multiple aggregate values are computed in one go. For most basic types of aggregate function, it can be emulated simply by using CASE expressions, because standard aggregate functions ignore NULL values when aggregating. 

Lukas shows how you can also implement this logic using JSON_TRANSFORM() though I think I’d just as soon stick with COUNT(CASE WHEN BOOK.TITLE LIKE 'A%' THEN 1 END) and try hard not to think about shredding JSON.

Comments closed

Exporting Extended Event Results to CSVs and Tables

David Fowler comes in with a quick tip:

This is something that I’ve seen pop up a couple of times on various forums recently, how can we export the results of an XEvent session. So I thought I’d answer the question here.

It’s actually a very simple thing to do in SSMS, although the option is not always easy to spot.

Click through to see how you can do it in T-SQL, as well as within SSMS.

Comments closed

Leaving the SQL Server Public Role Be

David Seis has some advice:

Roles play a crucial part in managing permissions and access control. One such role that often gets overlooked is the public role. This role, by default, has minimal permissions and every database user is a member. Sometimes, however, there are permissions granted to the public role which puts your data at risk.

David’s focus is not adding new permissions to the public role. I also recommend not taking permissions away. Some security tools will gripe that the public role has CONNECT access. I was foolish enough to follow those tools’ recommendations once and removed CONNECT from public. Then I got a bunch of calls about people not being able to log in… I suppose you can spin a yarn about how there are logins that a person should not be able to use for connections, but how often is that really the case?

Comments closed

Always Encrypted and Secure Enclaves

Pieter Vanhove explains what secure enclaves are and why they’re useful:

Encryption is a vital technique for protecting sensitive data from unauthorized access or modification. SQL Server and Azure SQL Database offer two encryption technologies that allow you to encrypt data in use: Always Encrypted and Always Encrypted with secure enclaves. In this blog post, we will compare these two technologies and highlight their benefits and limitations.

Read on to learn more.

Comments closed

Using Query Store to Fix a Cardinality Estimation Problem

Michael Bourgon solves an issue:

This morning I had a performance issue on a piece of code that worked long ago on a different server, and they were trying to put it in place today.  It was SLOW. Like, 10 minutes slow. With the added bonus that it’s done through a web app, so it never finishes, it just always times out. After dealing with various approaches, I finally tried using the old Cardinality Estimator, and it went from 10 minutes to 3 seconds. But the query is inside the application, it doesn’t call a stored procedure. Which means the devs changing it is Non-Trivial. So I went to an updated version of an old trick – query store hints (which used to be Plan Guides)

Click through for a list of actions Michael took.

Comments closed

Book Review of Bernoulli’s Fallacy

John Mount reviews a book:

First the conclusion: this is a well researched and important book. My rating is a strong buy, and Bernoulli’s Fallacy is already influencing how I approach my work.

My initial “judge the book by its back cover” impression of Bernoulli’s Fallacy was negative. The back cover writes some very large checks that I was initially (and wrongly) doubtful that “its fists could cash.” The thesis is that frequentist statistics (the dominant statistical practice) is far worse than is publicly admitted, and that Bayesian methods are the fix. However, other reviews and the snippets by people I respect (such as Andrew Gelman and Persi Diaconis) convinced me to buy and read the book. And I am glad that I read it. The back cover was, in my revised opinion, fully justified.

Read on for John’s full review of a book that is quite critical of frequentist statistics in favor of Bayesian statistics—so that already makes the book a winner for me.

Comments closed

Distribution Parameter Wrangling in TidyDensity

Steven Sanderson introduces a new set of functions:

Greetings, fellow data enthusiasts! Today, we’re thrilled to unveil a fresh wave of functionalities in the ever-evolving TidyDensity package. Buckle up, as we delve into the realm of distribution statistics!

This update brings a bounty of new functions that streamline the process of extracting key parameters from various probability distributions. These functions adhere to the familiar naming convention util_distribution_name_stats_tbl(), making them easily discoverable within your R workflow.

Read on for the list and an example of how to use them.

Comments closed

Cleaning Up Large System Databases

Josephine Bush doesn’t need enormous system databases:

Always set this on your SQL Servers so you don’t have this problem in the first place. This is in the SQL Server Agent settings. I remember having some agent jobs that used to serve this function that ran on a schedule, which may have been required in older versions of SQL Server.

Josephine focuses on SQL Agent history and database backup history, both of which are good ones. If you have an older version of SQL Server or are using the package deployment model, there may be an explosion of information in msdb regarding SSIS that you’d want to manage. Also, check if any of the databases are in Full recovery mode; if so, ensure that the backup script you’re using for transaction log backups actually backs up system databases.

Comments closed

The securityadmin Role in SQL Server

Jeff Iannucci talks about a role that might as well be sysadmin:

Based on the name, you probably can guess that members of the securityadmin role can make dangerous changes to the permissions of other server principals. What many folks don’t realize is that this role is simultaneously less dangerous and more dangerous than you might think.

Allow me to explain, or better yet show you what that means.

Click through for the explanation.

Comments closed