2020-07-23 – Curated SQL

Assuming a Role with AWS Powershell Tools

Published 2020-07-23 by Kevin Feasel

I’ve had some issues in the past working with AWS.Tools PowerShell SDK and correctly assuming credentials.
By default, most of the time it was easier to use a dedicated IAM credential setup for the purpose.
However, as I’ve wanted to run some scripts across multiple accounts, the need to simplify by assuming a role has been more important.
It’s also a better practice than having to manage multiple key rotations in all accounts.

Read on to see how far Sheldon has been able to take this, but also how much more work is left to do.

Comments closed

Data Privacy in Confluent Platform

Published 2020-07-23 by Kevin Feasel

David Millman shows off the Privitar Kafka Connector:

The initial message structure, in the left column above, is a simple JSON document with five fields. The middle column contains the list of rules that must be applied, defining the policy. On the right is a sample output message generated as a result of the policy being applied to the initial message.
In the Privitar Policy Manager, a user maps the individual fields to the appropriate rule, as shown in the screenshot below. A rule is applied to each of the fields and the schema is read as a single table structure, named testfile. These rules can be applied for every instance of the schema.

Read on for more.

Comments closed

Backing Up PostgreSQL

Published 2020-07-23 by Kevin Feasel

Valerie Parham-Thompson gives us a few methods for backing up PostgreSQL databases:

There are at least four ways to back up a Postgres database: SQL dump, filesystem snapshots, continuous archiving, and third-party tools.
For each, notes on the recovery point objective (or RPO, which is a measure of how up-to-date your application and business needs will require the data to be) and recovery time objective (or RTO, which is a measure of how quickly the restore needs to be completed after an outage begins) are provided. You should weigh these indicators against your business requirements for the data in question.

Read on to learn more about each of these processes.

Comments closed

SQL Server Management Studio 18.6 Now GA

Published 2020-07-23 by Kevin Feasel

Drew Skwiers-Koballa announces SQL Server Management Studio version 18.6:

The 18.6 release is the second major release of SSMS in 2020 and packs several high impact changes, including a fix for crashes in database diagrams. Key fixes include:
– Save to XEL file error fix.
– Bacpac file import error fix.
– Database diagrams crash fix.
– Addressed sources of three common application hangs.

Lots of bugfixes in here, but there are a few new things as well.

Comments closed

Columnstore Index Maintenance

Published 2020-07-23 by Kevin Feasel

Ed Pollack continues a series on columnstore indexes:

Like with standard B-tree indexes, a columnstore index may be the target of a rebuild or reorganize operation. The similarities end here, as the function of each is significantly different and worth considering carefully prior to using either.
There are two challenges addressed by columnstore index maintenance:
1. Residual open rowgroups or open deltastores after write operations complete.
2. An abundance of undersized rowgroups that accumulate over time

Read on for the full story.

Comments closed

Downgrading SQL Server on Linux

Published 2020-07-23 by Kevin Feasel

Sreekanth Bandarla wants to roll back cumulative updates on Linux:

Of course you can get this info from SQL or several other ways in Linux. Okay, now we know we got SQL Server 2019 CU5 running on this server to work with. Let’s just assume CU5 broke something in my database and I want to go back to CU4. How do I do that?

Click through to see how to do this for Red Hat (or any system using yum). Debian-based don’t have a downgrade option, but you can use apt-get install mssql-server=[version number] instead.

Comments closed

Splitting Data with T-SQL

Published 2020-07-23 by Kevin Feasel

Chris Hyde shows a few techniques for splitting out data into training, testing, and validation sets:

We see right away that this method failed horribly as all of the data was placed into the same dataset. This holds true no matter how many times we execute the code, and it happens because the RAND() function is only evaluated once for the whole query, and not individually for each row. To correct this we’ll instead use a method that Jeff Moden taught me at a SQL Saturday in Detroit several years ago – generating a NEWID() for each row, using the CHECKSUM() function to turn it into a random number, and then the % (modulus) function to turn it into a number between 0 and 99 inclusive.

I’d have to test it out, but I’d think you could modify method 3 to include a CROSS APPLY to perform one ABS(CHECKSUM(NEWID()) and get exact counts that way without a temp table.

Comments closed

Quality Azure Data Studio Extensions

Published 2020-07-23 by Kevin Feasel

Randolph West vouches for some Azure Data Studio extensions:

It’s worth mentioning that for the most part Azure Data Studio extensions are extremely lightweight, both in download size and memory usage. Installing this many on SQL Server Management Studio (SSMS) would slow it down dramatically.
Note: not all extensions can be installed from the Extensions pane. For many of them you must visit a website, download the VSIX file and install it manually using the File > Install Extension from VSIX Package menu option. In most cases you can trust extensions from reputable publishers, but always take care.

Randolph has quite a few more extensions than I do, but I can’t say any of those are a bad choice.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Day: July 23, 2020