Press "Enter" to skip to content

Curated SQL Posts

Understanding the Bayesian Nature of Kalman Filters

Holger von Jouanne-Diedrich gives us an interesting interpretation of Kalman filters:

The Kalman filter is a very powerful algorithm to optimally include uncertain information from a dynamically changing system to come up with the best educated guess about the current state of the system. Applications include (car) navigation and stock forecasting. If you want to understand how a Kalman filter works and build a toy example in R, read on!

The following post is based on the post “Das Kalman-Filter einfach erklärt” which is written in German and uses Matlab code (so basically two languages nobody is interested in any more 😉 ). This post is itself based on an online course “Artificial Intelligence for Robotics” by my colleague Professor Sebastian Thrun of Standford University.

In fairness, I regret only one thing about learning German: that I’ve forgotten so much over the years.

Comments closed

Showing Dates in Reverse Order on a Matrix

Gilbert Quevauvilliers solves an annoyance:

While doing some work I had a matrix that I had created. What I found was that when I had more Yr-Month values that could fit in the matrix due to having the scroll bar it would show me the older Yr-Months and not the latest ones as shown below.

I always like to think what the experience would be like for the report consumer. And if I was using this report to have to scroll each and every time to see the latest data would be a pain to say the least.

What I came up with was a relatively easy solution but made the reporting experience that much better. Where it will display the Yr-Months in descending order (The current month first)

Below are the steps I completed to achieve this.

By default I like to have dates moving from left to right, as it’s easier for a native-English speaking environment to interpret time that way. But in cases like Gilbert’s, I’d rather have glanceability, seeing the most important data without any sort of scrolling, clicking, zooming, or manual operation. The ideal might be to have the tool give us the option to start scrolled all the way to the right, but in lieu of that, Gilbert provides a nice second-best solution.

Comments closed

Azure Elastic Jobs Now GA

Niko Neugebauer gives into Azure Elastic Jobs, now generally available:

They key feature that needs to be well understood and which points to the potential of the Elastic Job Agent is that you are in no way limited by your own Azure SQL Database, nor by the logical Azure SQL Server where this database is located (contrary to the MSDB Database on the SQL Server), nor will you be limted by the Azure Region, Azure Resource Group or even Azure Subscription – you can configure the Elastic Job that will be reaching out to potentially any Azure SQL Database (given the necessary settings & permissions are correctly configured).

Read the whole thing. Niko shares some interesting thoughts on how it works, how you can tie your one server to a whole host of SQL Databases, and a wish list on what should come next.

Comments closed

Trust and Warehouse Data

Rob Farley explains one way that people might lose trust in your warehouse data:

The scenario is that there’s a source system, and there’s a table in a warehouse that is being used to report on it. Maybe it’s being populated by Integration Services or Data Factory. Maybe it’s being populated by T-SQL. I don’t really care. What I care about is whether the data in the warehouse is a true representation of what’s in the source system.

If it’s not a true representation, then we have all kinds of problems.

Mostly, that our warehouse is rubbish.

Read on for an example of how this might occur and what you can do to prevent it.

Comments closed

Locks in SQL Server

Dan Jackson provides us a primer on how SQL Server implements locking:

We have all heard of locks and can probably even name a few; shared or exclusive locks, for example. We notice them more when something goes wrong and we run into blocking or other performance problems, but what are locks and how do they work in SQL Server?

In this blog, I aim to give a basic answer to that question and provide you with an overview of the different lock modes in SQL Server and how it all works…

Let us make it clear from the off; locks are an essential part of SQL Server. In a multi-user system, there will be many users who wish to access the same resources at the same time. This means that SQL Server must have measures in place to handle concurrency and prevent adverse side effects. Locking is one of those measures.

Read the whole thing.

Comments closed

Downsides to Optimization in Spark SQL

Anuj Saxena takes us through some of the pros and cons of using the Catalyst Optimizer in Spark, including a couple of issues:

I am sure the optimizations make the calculation time very short and these optimizations are implemented in such a way that you just have to provide the logic and everything else will be done in abstraction. But as my friend and colleague Ramandeep says “Abstract features come with abstract issues”. So following are the few issues which I have faced in my recent interaction with Spark SQL:

1. Too large of a query to be stored in memory
2. Implicit optimizations interfere with partitioning

Click through for examples of this.

Comments closed

R and the TIOBE Index

Alex Woodie notices a change in fortunes for R:

Don’t look now, but R, which some had written off as a language in terminal decline in lieu of Python’s immense and growing popularity, appears to be staging a furious comeback the likes of which IT has rarely seen.

According to the TIOBE Index, which tracks the popularity of programming languages (as expressed in Web searches), R has risen an unprecedented 12 spots, up from number 20 in the summer of 2019 to number 8 on its list today.

I’m happy to see this, as frankly, I think R’s a better language for statistical analysis and data visualization than Python and it’s not close. That’s the advantage of being a DSL: you get to focus on doing one or two things really well, and for R that’s statistical analysis and data visualization.

Comments closed

Web Page Scraping with R and ML Services

Dave Mason shows how you can scrape webpages with R and pull the resulting data into SQL Server using Machine Learning Services:

For this post, it might make more sense to skip ahead to the end result, and then work our way backwards. Here is a web page with some data: Boston Celtics 2016-2017. It shows two HTML tables (grids) of data for the Boston Celtics, a professional basketball team. The first grid lists the roster of players. We will scrape the web page, and write the data from the “roster” grid to a SQL Server table.

Read on for a demonstration of the process.

Comments closed