
Curated SQL Posts

Measure-Object Differences

Klaas Vandenberghe notes a discrepancy in how Measure-Object works, based on syntax:

So, is the -InputObject parameter broken? Not really, but we need to be aware of the logic behind it.
Maybe we are used to working with -ComputerName for a lot of cmdlets and functions, and we rely blindly upon the ability of the command to handle whatever collection we provide. We know this executes the action we chose, like Get-DbaSqlService or Get-DbaOperatingSystem or whatever, separately against every computer in the collection. The collection is ‘folded out’, ‘unpacked’, ‘split’.

-InputObject doesn’t do that! Is that wrong? Not necessarily; it may be a deliberate design choice, so that you can enquire into the properties of the collection itself as well as those of its ‘members’ or ‘children’. It’s just a surprise that the behaviour differs between pipeline input and parameter input.

Something to keep in mind when writing PowerShell scripts.


Inference Attacks

Phil Factor explains that your technique for pseudonymizing data doesn’t necessarily anonymize the data:

It is possible to mine data for hidden gems of information by looking at significant patterns of data. Unfortunately, this sometimes means that published datasets can reveal sensitive data when the publisher didn’t intend it, or even when they tried to prevent it by suppressing any part of the data that could enable individuals to be identified.

Using creative querying, linking tables in ways that weren’t originally envisaged, as well as using well-known and documented analytical techniques, it’s often possible to infer the values of ‘suppressed’ data from the values provided in other, non-suppressed data. One man’s data mining is another man’s data inference attack.

Read the whole thing.  One big problem with trying to anonymize data is that you don’t know how much the attacker already knows.  Especially with outliers or small samples, you may be able to glean interesting information from a series of queries.  Even if the application only returns aggregates over groups of at least N rows, you can often put together a set of queries that slice the population in different ways until you tease out details on an individual.  Phil covers these types of inference attacks.
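As a toy illustration of the general idea (the table, columns, and numbers below are invented, not taken from Phil's article), two innocuous-looking aggregate queries can pin down a single person's value:

```sql
-- Suppose a published reporting endpoint only ever returns aggregates.
-- Hypothetical table and values, purely for illustration.
SELECT COUNT(*) AS Headcount, AVG(Salary) AS AvgSalary
FROM dbo.StaffSalaries
WHERE Department = 'Finance';
-- Say this returns: 10 people, average 50,000

SELECT COUNT(*) AS Headcount, AVG(Salary) AS AvgSalary
FROM dbo.StaffSalaries
WHERE Department = 'Finance'
  AND StartDate < '20170101';
-- Say this returns: 9 people, average 48,000

-- If you know one person joined Finance in 2017, their salary must be
-- 10 * 50,000 - 9 * 48,000 = 68,000, even though no query ever exposed an individual row.
```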


Azure SQL Database Deployment Account Errors

Steve Jones troubleshoots an issue with Azure SQL Database:

 I’ve had most builds work really well. I tried a number of things, but kept getting a few items in the build. There were login errors or network errors, both of which bothered me since I could manually log in with SSMS from the same machine as my build agent.

I suspected a few things here, one of which was the use of named pipes for the Shadow database and TCP for Azure SQL Database.

Eventually, I decided to fall back to msbuild, ignoring VSTS, and make sure all my parameters were correct.

Read on for the rest of the story.


Combining Densities

Paul White explains how the SQL Server cardinality estimator will build an estimate involving multiple single-column statistics:

The task of estimating the number of rows produced by a GROUP BY clause is trivial when only a single column is involved (assuming no other predicates). For example, it is easy to see that GROUP BY Shelf will produce 21 rows; GROUP BY Bin will produce 62.

However, it is not immediately clear how SQL Server can estimate the number of distinct (Shelf, Bin) combinations for our GROUP BY Shelf, Bin query. To put the question in a slightly different way: Given 21 shelves and 62 bins, how many unique shelf and bin combinations will there be? Leaving aside physical aspects and other human knowledge of the problem domain, the answer could be anywhere from max(21, 62) = 62 to (21 * 62) = 1,302. Without more information, there is no obvious way to know where to pitch an estimate in that range.

Yet, for our example query, SQL Server estimates 744.312 rows (rounded to 744 in the Plan Explorer view) but on what basis?

Read on for debugger usage, Shannon entropy calculations, and all kinds of other fun stuff.
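If you want to poke at the raw ingredients yourself, the single-column density vectors show where the 21 and 62 come from. This is a sketch, assuming the AdventureWorks Production.ProductInventory table with its Shelf and Bin columns; the statistics are created explicitly here so their names are known, rather than relying on whatever auto-created statistics Paul inspects:

```sql
-- Single-column statistics on the two grouping columns.
CREATE STATISTICS stat_Shelf ON Production.ProductInventory (Shelf);
CREATE STATISTICS stat_Bin   ON Production.ProductInventory (Bin);

-- "All density" in the density vector is 1 / (number of distinct values),
-- i.e. 1/21 for Shelf and 1/62 for Bin per the counts quoted above. These are
-- the inputs the optimizer combines to estimate the two-column GROUP BY.
DBCC SHOW_STATISTICS ('Production.ProductInventory', stat_Shelf) WITH DENSITY_VECTOR;
DBCC SHOW_STATISTICS ('Production.ProductInventory', stat_Bin)   WITH DENSITY_VECTOR;

-- The two-column grouping whose 744.312-row estimate Paul dissects:
SELECT Shelf, Bin
FROM Production.ProductInventory
GROUP BY Shelf, Bin;
```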


Thinking About Implicit Conversions

Bert Wagner shows how implicit conversions in a predicate can ruin query performance:

Why? Because SQL is performing that implicit conversion to the numeric datatype for every single row in my table. Hence, it can’t seek using the index because it ends up having to scan the whole table to convert every record to a number first.

And this doesn’t only happen with numbers and string conversion. Microsoft has posted an entire chart detailing what types of data type comparisons will force an implicit conversion:

This is one of those things that can easily elude you because the query will often return results in line with what you expect, so until you have a performance problem, you might not even think to check.
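A minimal way to see the effect for yourself; the table and column here are hypothetical, and the important part is a string column compared against a numeric literal:

```sql
-- PhoneNumber is stored as VARCHAR. A numeric literal has higher data type
-- precedence, so SQL Server must CONVERT_IMPLICIT every row's PhoneNumber to a
-- number before comparing, which rules out an index seek (and errors if any
-- value isn't numeric).
SELECT CustomerID
FROM dbo.Customers
WHERE PhoneNumber = 5551234567;      -- implicit conversion: scan

-- Matching the column's type lets the optimizer seek an index on PhoneNumber.
SELECT CustomerID
FROM dbo.Customers
WHERE PhoneNumber = '5551234567';
```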


Missing Index DMV Limitations

Brent Ozar goes into detail on why you should not blindly trust missing index recommendations in SQL Server:

SQL Server’s telling us that it needs an index to do an equality search on LastAccessDate – because our query says LastAccessDate = ‘2016/11/10’.

But in reality, that’s not how you access datetime fields because it won’t give you everyone who accessed the system on 2016/11/10 – it only gives you 2016/11/10 00:00:00. Instead, you need to see everyone on that day, like this:
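Brent's actual query isn't reproduced above; below is a sketch of the half-open date-range pattern he's describing. LastAccessDate comes from the excerpt, while the table and other column names are assumed for illustration.

```sql
-- Equality on a datetime only matches midnight exactly; a half-open range
-- catches every access during the day and can still use an index seek.
SELECT Id, DisplayName
FROM dbo.Users
WHERE LastAccessDate >= '20161110'
  AND LastAccessDate <  '20161111';
```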

Read the whole thing.  The crux of this is that the missing index recommendation process only gets to see what you’re running at the time you run it, so it can’t generalize all that well; that’s your job.


Selecting Into A Specific Filegroup

Andrew Pruski shows off a new feature in SQL Server 2017:

Now I can run the SELECT…INTO statement using the new ON FILEGROUP option. I’m going to run an example SELECT statement to capture Sales in the UK: –
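Andrew's script isn't reproduced above, but a minimal sketch of the SQL Server 2017 syntax looks like this; the database, filegroup, file, and table names are illustrative:

```sql
-- Add a filegroup and a file to hold the new table (names made up for the sketch).
ALTER DATABASE SalesDB ADD FILEGROUP FG_UKSales;
ALTER DATABASE SalesDB ADD FILE
    (NAME = N'UKSales_Data', FILENAME = N'D:\SQLData\UKSales_Data.ndf')
    TO FILEGROUP FG_UKSales;

-- New in SQL Server 2017: SELECT ... INTO can target a specific filegroup.
SELECT *
INTO dbo.SalesUK ON FG_UKSales
FROM dbo.Sales
WHERE Country = N'UK';
```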

We are one step closer to CTAS on-prem…  Being able to select into a specific filegroup is nice when you want to segregate tables by filegroup to make recovery of the most critical tables faster: you might have a primary filegroup, then a filegroup for the tables your application needs first, followed by the history tables and other large tables that the app doesn’t need immediately.


Isolation Level Basics

Randolph West describes the primary isolation levels in SQL Server:

There are four isolation levels in SQL Server (as quoted from SQL Server Books Online):

  • Read uncommitted (the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read)
  • Read committed (Database Engine default level)
  • Repeatable read
  • Serializable (the highest level, where transactions are completely isolated from one another)

Read on for a discussion of what these mean, as well as how optimistic versus pessimistic concurrency (in this case, Read Committed Snapshot Isolation versus Read Committed) comes into play.
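As a point of reference, here is roughly how you set one of the pessimistic levels for a session, alongside the database option that swaps READ COMMITTED for its optimistic (RCSI) cousin; the database and table names below are made up:

```sql
-- Pessimistic: shared locks are held to the end of the transaction, so rereads
-- of the same rows are guaranteed to return the same values.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    SELECT Quantity FROM dbo.Inventory WHERE ProductID = 42;  -- hypothetical table
    -- ... other work; reading ProductID 42 again here returns the same value ...
COMMIT;

-- Optimistic: READ COMMITTED readers use row versions instead of shared locks.
-- (Requires no other active connections, or add WITH ROLLBACK IMMEDIATE.)
ALTER DATABASE SalesDB SET READ_COMMITTED_SNAPSHOT ON;
```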


The Data Lake From 10,000 Feet

Pradeep Menon has a high-level explanation of what a data lake is and how it differs from traditional data warehouses:

With the changes in the data paradigm, a new architectural pattern has emerged. It’s called the Data Lake Architecture. Like the water in the lake, data in a data lake is in the purest possible form. Just as a lake caters to the needs of different people, whether they want to fish, take a boat ride, or get drinking water from it, a data lake architecture caters to multiple personas. It provides data scientists an avenue to explore data and create a hypothesis. It provides an avenue for business users to explore data. It provides an avenue for data analysts to analyze data and find patterns. It provides an avenue for reporting analysts to create reports and present to stakeholders.

The way I compare a data lake to a data warehouse or a mart is like this:

A Data Lake stores data in its purest form, caters to multiple stakeholders, and can also be used to package data in a form that can be consumed by end users. A Data Warehouse, on the other hand, is already distilled and packaged for defined purposes.

One way of thinking about this is that data warehouses are great for solving known business questions:  generating 10-K reports or other regulatory compliance reporting, building the end-of-month data, and viewing standard KPIs.  By contrast, the data lake is (among other things) for spelunking: trying to answer those one-off questions people seem to have, but for which the warehouse never seems to have quite the right set of information.


A New ODBC Package For R

David Smith looks at the odbc package in R:

The odbc package is a from-the-ground-up implementation of an ODBC interface for R that provides native support for additional data types (including dates, timestamps, raw binary, and 64-bit integers) and parameterized queries. The odbc package provides connections with any ODBC-compliant database, and has been comprehensively tested on SQL Server, PostgreSQL and MySQL. Benchmarks show that it’s also somewhat faster than RODBC: 3.2 times faster for reads, and 1.9 times faster for writes.

Sounds like odbc lets you run ad hoc queries and also lets you use dplyr as an ORM, similar to using LINQ in C#.
