Day: February 21, 2023

Matrix Multiplication in R with DuckDB and SQLite

Karsten Weinert compares two databases:

On my laptop with 16 GB RAM, I would like to perform a matrix-vector multiplication with a sparse matrix of around 10 million columns and 2500 rows. The matrix has only approximately 2% non-zero entries, but that is still 500 million numbers plus the column/row information, too large to work with comfortably in memory.

A while ago, I tried using sqlite for this task. It kind of worked, but was too slow to be useful. This weekend, I revisited the problem and tried using duckdb.

Read on for the results. I’ve heard enough positives about DuckDB over the past few weeks that it makes me want to try it out. H/T R-Bloggers.
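
To make the idea concrete, here is a minimal sketch of a sparse matrix-vector product expressed as a join and an aggregate. It is not Karsten's code: it uses Python's built-in sqlite3 module rather than R, and the table names (a, x), column names (i, j, v), and the helper sparse_matvec are assumptions made up for illustration. Only the non-zero entries are stored, as (row, column, value) triples.

```python
import sqlite3


def sparse_matvec(entries, vector):
    """Compute y = A x in SQL.

    entries: iterable of (row, col, value) triples for the non-zero cells of A.
    vector:  iterable of (index, value) pairs for x.
    Returns a dict mapping row index to the resulting value.
    """
    con = sqlite3.connect(":memory:")  # hypothetical in-memory database
    con.execute("CREATE TABLE a (i INTEGER, j INTEGER, v REAL)")
    con.execute("CREATE TABLE x (j INTEGER PRIMARY KEY, v REAL)")
    con.executemany("INSERT INTO a VALUES (?, ?, ?)", entries)
    con.executemany("INSERT INTO x VALUES (?, ?)", vector)

    # Join each non-zero cell to the matching vector element,
    # then sum the products per matrix row.
    rows = con.execute(
        """
        SELECT a.i, SUM(a.v * x.v)
        FROM a
        JOIN x ON a.j = x.j
        GROUP BY a.i
        ORDER BY a.i
        """
    ).fetchall()
    con.close()
    return dict(rows)


# A 2x3 matrix with three non-zero entries, multiplied by a dense vector.
entries = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0)]
vector = [(0, 1.0), (1, 2.0), (2, 4.0)]
result = sparse_matvec(entries, vector)
```

The same query shape should carry over to other SQL engines; how well it performs at 500 million entries is exactly what the linked post measures.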

The Let Operator in KQL

Robert Cain continues a series on KQL:

Let me tell you about let, my favorite operator in the Kusto Query Language. Why my favorite?

It is extremely flexible. It lets you create constants, variables, datasets, and even reusable functions. Let me tell you, it’s very powerful.

My big problem with let, specifically with variable creation, is that the variables do not persist between batches. You can use variables across statements, but only if you execute all of the relevant statements in one batch. This makes exploratory query building harder.

Halloween Protection and Non-Clustered Indexes

Jared Poche digs further into Halloween protection:

I find myself talking about the Halloween Problem a lot and wanted to fill in some more details on the subject. In short, the Halloween Problem is a case where an INSERT\UPDATE\DELETE\MERGE operates on a row more than once, or tries to and fails. In the first recorded case, an UPDATE changed multiple rows in the table more than once.

So let’s take a look at an example using a publicly available database, WideWorldImporters.

Read on for a case of Jared starting from the known and moving into the unknown.
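
As a rough illustration of the general problem (not of Jared's WideWorldImporters example, and not of how SQL Server works internally), the following toy simulation scans a sorted "index" on salary while the update changes that same key. The names naive_update and protected_update, the employee data, and the 10% raise below a 25,000 limit are all assumptions chosen for the sketch.

```python
import bisect


def naive_update(salaries, limit, pct):
    """Scan an index ordered by salary while updating the salary itself."""
    index = sorted((s, emp) for emp, s in salaries.items())
    touched = 0
    cursor = (-1, "")  # last index key the scan has passed
    while True:
        # Move to the first index entry after the cursor.
        pos = bisect.bisect_right(index, cursor)
        if pos == len(index) or index[pos][0] >= limit:
            break  # sorted index: nothing further qualifies
        salary, emp = index.pop(pos)
        cursor = (salary, emp)
        # The updated row moves forward in the index, ahead of the scan,
        # so the scan will meet it again.
        bisect.insort(index, (salary * (100 + pct) // 100, emp))
        touched += 1
    return {emp: s for s, emp in index}, touched


def protected_update(salaries, limit, pct):
    """Separate the read phase from the write phase."""
    # Read phase: decide which rows qualify before changing anything.
    qualifying = [emp for emp, s in salaries.items() if s < limit]
    result = dict(salaries)
    # Write phase: each qualifying row is updated exactly once.
    for emp in qualifying:
        result[emp] = result[emp] * (100 + pct) // 100
    return result, len(qualifying)


salaries = {"ann": 10000, "bob": 20000, "cy": 30000}
naive_result, naive_touched = naive_update(salaries, 25000, 10)
safe_result, safe_touched = protected_update(salaries, 25000, 10)
```

In the naive version, each row keeps getting raises until it no longer qualifies. The protected version mirrors the idea of materializing the set of rows to change before any writes happen.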

Checking if a TCP Port Is in Use

Tom Collins knocks on doors:

Question: SQL Server won’t start, so I checked Event Viewer and found the following message.

Server TCP provider failed to listen on [ ‘any’ <ipv4> 50010]. Tcp port is already in use.

How can I check whether the port is already in use, and which other process or service has locked the port and is therefore preventing SQL Server from starting on its designated port?

Read on to see how you can use PowerShell to find the answer on Windows.
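
For a cross-platform way to answer only the first half of the question, here is a small sketch that uses Python's standard socket module rather than the PowerShell approach in Tom's post. The helper name port_in_use is hypothetical, and the check is a simple heuristic: it tries to bind to the port and treats a failure as "in use". It does not identify the owning process.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP socket cannot be bound to (host, port).

    Assumption: any OSError on bind is treated as "in use". A permission
    error on a privileged port would also return True.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return True
        return False


# Demonstration: occupy a free port chosen by the OS, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
listener.listen(1)
busy_port = listener.getsockname()[1]

in_use_while_listening = port_in_use(busy_port)
listener.close()
```

To probe the port from the error message, you would call port_in_use(50010) on the server itself.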

Getting Started with Azure Synapse Analytics

Shabnam Watson gets us started with Azure Synapse Analytics:

In this blog post, I show you how easy it is to start an Azure Synapse Analytics workspace (instance) and use its Serverless SQL Pool engine to analyze sample publicly available data. As you will read shortly, Azure Synapse Analytics provides many compute engines for different use cases. The easiest one to get started with is its Serverless SQL Pool, since every Azure Synapse Analytics instance comes with one already created and ready to use. It also does not incur any cost unless you use it, which makes it very attractive to those who have a limited Azure budget.

Click through to see how to create a workspace, load some data, and query it via the serverless SQL pool.

Lessons Learned from Creating Database Projects

Olivier Van Steenlandt shares some hard-earned knowledge:

Almost 5 years ago, I made the switch from “traditional” database development using SQL Server Management Studio to a more flexible way of development using Database Projects and Source Control. In the first few years, I worked with BitBucket as my code management system, and for the past 2 years I have been using Azure DevOps. In my spare time, I use GitHub as well.

During this transition, I came across a couple of bumps, because I wasn’t familiar with Database Projects and I only had a notion about Source Control (Git). In this blog post, I will describe my journey and the lessons learned during the process.

Click through for several tips.

The Observer Effect with Extended Events

Jonathan Kehayias measures the measurer:

SQL Server 2022 offers a new feature enhancement to Extended Events that allows it to now track the performance and publishing metrics of the events that have been enabled in an event session that is running on the server. Four new columns were added in the sys.dm_xe_session_events DMV in SQL Server 2022 that provide additional information about the event publishing performance metrics when an event session is running:

This fits more in the “wacky ideas” category than a sensible thing to do, but it can give you a better idea of how expensive certain events are.

Checking for Permissions on a Database User

Chad Callihan keeps misplacing those permissions:

I recently encountered an unusual permissions issue with multiple databases. New databases were not including all of the permissions that were supposed to be set following database restores. At the time, I wasn’t sure if the permission was being granted and then revoked or not granted at all. I wanted a script I could run to definitively show that permissions did exist and also have proof for myself that, if permissions seemingly vanish later on while testing, I know they were present at one point in time.

Click through to see what Chad plans to use to see if permissions disappear later. This will work with directly granted permissions on a user, so you will miss out on some chained permissions coming as a result of being in a Windows group or user-defined application/database role.
