Press "Enter" to skip to content

Month: January 2020

Fun With Database Names

Jason Brimhall takes us through database names you shouldn’t use:

Let’s figure we have a requirement to create a database with sensitive data. Due to the sensitivity of the data, it is classified confidential (for your eyes only, don’t talk about it and plug your ears if somebody starts talking about it). This is so sensitive that an apt name for the database could be anything like 🙈 or 🙉 or 🙊. Being smart, you know there are two more databases coming down the line so you only want to pick one of those for the name and not all three (though all three could make sense for a single database name).

Just because you can doesn’t mean you should…

Comments closed

Gap and Island Analysis

Ed Pollack covers a topic of importance for database developers:

Within a data set, an island of data is any ordered sequence where each row is in close proximity to the rows around it. For some data types and analysis, “close proximity” will mean consecutive. Dates, integers, and letters of the alphabet can be ordered sequentially where two adjacent values will not be able to have additional values in between them.

For example, there are no dates between October 23rd and October 24th. Similarly, there are no integers between 17 and 18 and no English letters between E and F. For these examples, an island of data could be defined as a sequence of consecutive values. A gap can be defined as a sequence of missing values.

There are a lot of difficult problems which gap & island analysis makes much easier by pivoting the way you think about the problem.

Comments closed

Fun With Secure Enclaves

Ned Otter continues a series on SQL Server 2019 Always Encrypted with Secure Enclaves:

In the first post of this series, we explored the requirements for using Always Encrypted with secure enclaves, as well as some of the limitations.

For this post, we’ll be using Powershell to install and configure the HGS server (required for “attestation”) as well as executing the steps required to configure the SQL 2019 server to work with HGS.

Read on for a few disclaimers and a detailed setup article.

Comments closed

Using ksqlDB to Read Twitter Data

Robin Moffatt has a quick demonstration of ksqlDB:

I’m going to show you how to use ksqlDB to do the following:

– Configure the live ingest of a stream of data from an external source (in this case, Twitter)
– Filter the stream for certain columns
– Create a new stream populated only by messages that match a given predicate
– Build aggregate materialised views, and use pull queries to directly fetch the state from these

Let’s dive in! As always, you’ll find the full test rig for trying this out yourself on GitHub.

Comments closed

Chesterton’s Fence in Development Terms

Pete Warden picked a blog post title I couldn’t refuse:

This script came to mind as I was thinking back over the year for a few reasons. One of them was that I spent a non-trivial amount of time writing and debugging it, despite its small size and the apparent simplicity of the problem it tackled. Even in apparently glamorous fields like machine learning, 90% of the work is nuts and bolts integration like this. If anything you should be doing more of it as you become more senior, since it requires a subtle understanding of the whole system and its requirements, but doesn’t look impressive from the outside. Save the easier-to-explain projects for more junior engineers, they need them for their promotion packets.

The reason this kind of work is so hard is precisely because of all the funky requirements and edge cases that only become apparent when code is used in production. As a young engineer my first instinct when looking at a snarl of complex code for something that looked simple on the surface was to imagine the original authors were idiots. I still remember scoffing at the Diablo PC programmers as I was helping port the codebase to the Playstation because they used inline assembler to do a simple signed to unsigned cast. My lead, Gary Liddon, very gently reminded me that they had managed to ship a chart-topping game and I hadn’t, so maybe I had something to learn from their approach?

I am a huge fan of the concept which, made brief, states that if you do not understand why something is the case, don’t change it. If you do understand it, maybe change it but be prudent about it. It’s also something I have often trouble with, as my natural inclination toward code bases is to use the cleansing power of fire to burn it all down.

Comments closed

Registering a Raspberry Pi 4 as an IoT Edge Device

Hasan Savran takes us through turning a Raspberry Pi 4 into an Azure IoT Edge device:

You can buy all type of sensors and connect them to Raspberry Pi. Then you can use Python or .NET Core to write small applications to check your connected sensors and read data from the sensors. If you like to push this data to store or analyze in Azure, then you need to make Raspberry Pi ready by installing couple of applications.
      Installing an application in Windows, is not a big deal for me. I had to install and configure all the applications in Linux in this project. First thing we need to do is copying some files to register Microsoft GPG key and software repository feed. To do that, we will use the curl command. Curl is used for transferring data using various protocols including HTTP/S. We are going to use it to copy some files from Internet to local storage. It’s a fancy copy tool.

There are a few steps involved, but nothing too onerous. I think I know where Hasan is going with this, too.

Comments closed

Azure Synapse Analytics Result Set Caching

Niko Neugebauer takes us through result set caching in Azure SQL Data Warehouse Azure Synapse Analytics:

I just put some result on the output, because as you can imagine there are some certain limits on the amount of the output that will be cached and that will be not. Besides the basic logical stuff, such as having deterministic functions only (functions which output will not be varying depending on the execution), not using System Objects or UDFs (and it seems that scalar UDF inlining is not a part of Azure SQL DW yet), no row-level security or column-level security enabled, the main thing and which seems to be pretty good decision as far as I am concerned – the row size larger than 64KB won’t be cached period.

Read on to see what Niko has learned, including cache performing and limitations. Between this and the data pools in SQL Server Big Data Clusters, Microsoft’s spent some time thinking about data caching in cloud-based versions of SQL Server.

Comments closed

Negative Port Numbers in SQL Server

Jason Brimhall takes us through a bit of a mystery:

Take quick note of the port number I have circled in red. This doesn’t match the original query at all. In fact, it doesn’t come anywhere close to the actual port number. In addition, the port number shown here is a negative value. Obviously a negative port is not correct as TCP/IP ports only range from 0-65535. So what is happening here?

Read on for the answer.

Comments closed