Press "Enter" to skip to content

Curated SQL Posts

SQL Server ML Services

SQL Server R Services is now SQL Server Machine Learning Services and supports Python.  First, Nagesh Pabbisetty and Sumit Kumar talk about Python support:

The addition of Python builds on the foundation laid for R Services in SQL Server 2016 and extends that mechanism to include Python support for in-database analytics and machine learning. We are renaming R Services to Machine Learning Services, and R and Python are two options under this feature.

The Python integration in SQL Server provides several advantages:

  • Elimination of data movement: You no longer need to move data from the database to your Python application or model. Instead, you can build Python applications in the database. This eliminates barriers of security, compliance, governance, integrity, and a host of similar issues related to moving vast amounts of data around. This new capability brings Python to the data and runs code inside secure SQL Server using the proven extensibility mechanism built in SQL Server 2016.

  • Easy deployment: Once you have the Python model ready, deploying it in production is now as easy as embedding it in a T-SQL script, and then any SQL client application can take advantage of Python-based models and intelligence by a simple stored procedure call (a sketch of such a call follows this list).

  • Enterprise-grade performance and scale: You can use SQL Server’s advanced capabilities like in-memory tables and columnstore indexes with the high-performance, scalable APIs in the RevoScalePy package. RevoScalePy is modeled after the RevoScaleR package in SQL Server R Services. Using these with the latest innovations in the open source Python world allows you to bring unparalleled selection, performance, and scale to your SQL Python applications.

  • Rich extensibility: You can install and run any of the latest open source Python packages in SQL Server to build deep learning and AI applications on huge amounts of data in SQL Server. Installing a Python package in SQL Server is as simple as installing a Python package on your local machine.

  • Wide availability at no additional costs: Python integration is available in all editions of SQL Server 2017, including the Express edition.
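The stored procedure call in question is sp_execute_external_script, the same extensibility mechanism R Services uses. Here is a minimal sketch of what that looks like, assuming SQL Server 2017 with Machine Learning Services (Python) installed; the doubling logic stands in for a real model:

-- One-time setup: allow external (R/Python) scripts to run
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;
GO

-- Run Python over a T-SQL result set; InputDataSet and OutputDataSet
-- are the default pandas DataFrame names on the Python side
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
OutputDataSet = InputDataSet
OutputDataSet["Doubled"] = OutputDataSet["Value"] * 2
',
    @input_data_1 = N'SELECT 1 AS [Value] UNION ALL SELECT 2'
WITH RESULT SETS (([Value] INT, Doubled INT));

Any client that can call a stored procedure can consume the results, which is the deployment story the announcement is describing.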

Nagesh Pabbisetty also announces Microsoft R Server 9.1:

We took the first step with Microsoft R Server 9.0, and this follow-on release includes significant innovations such as:

  • New machine learning enhancements and inclusion of pre-trained cognitive models such as sentiment analysis & image featurizers

  • SQL Server Machine Learning Services with integrated Python in Preview

  • Enterprise-grade operationalization with real-time scoring and dynamic scaling of VMs

  • Deep customer & ISV partnerships to deliver the right solutions to customers

  • A panoply of sources to help you get started with ease

And Joseph Sirosh indicates that AI is where the money is:

So today it’s my pleasure to announce the first RDBMS with built-in AI: a production-quality Community Technology Preview (CTP 2.0) of SQL Server 2017. In this preview release, we are introducing in-database support for a rich library of machine learning functions, and now for the first time Python support (in addition to R). SQL Server can also leverage NVIDIA GPU-accelerated computing through the Python/R interface to power even the most intensive deep-learning jobs on images, text, and other unstructured data. Developers can implement NVIDIA GPU-accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput. In addition, developers can use all the rich features of the database management system for concurrency, high availability, encryption, security, and compliance to build and deploy robust enterprise-grade AI applications.

There’s a lot to digest here.

SQL Server 2017 CTP 2.0

The SQL Server team announces CTP 2.0 of SQL Server 2017:

Microsoft is excited to announce a new preview for the next version of SQL Server!  We disclosed a name for this next release, SQL Server 2017, today at the Microsoft Data Amp event. Community Technology Preview (CTP) 2.0 is the first production-quality preview of SQL Server 2017, and it is available on both Windows and Linux.  In this preview, we added a number of new capabilities, including the ability to run advanced analytics using Python in a parallelized and highly scalable way, the ability to store and analyze graph data, and other capabilities that help you manage SQL Server for high performance and uptime, including the Adaptive Query Processing family of intelligent database features and resumable online indexing.

I can finally call it “SQL Server 2017” instead of “SQL Server vNext.”  I don’t know why there was such a hubbub about the name 2017, but there you go.  Anyhow, I’ve grabbed the CTP and am raring to go.
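Of those capabilities, resumable online index rebuilds are the easiest to show in a snippet. A sketch of the new syntax, assuming an index named IX_Orders_OrderDate on a hypothetical dbo.Orders table:

-- Kick off an online rebuild that can be paused and resumed,
-- capped at 60 minutes of run time per attempt
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- Pause it when the business day starts; resume it later from another session
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders PAUSE;
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders RESUME;

-- Check on paused or in-flight resumable operations
SELECT name, percent_complete, state_desc
FROM sys.index_resumable_operations;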

Temporal Tables For R Source Control

Tomaz Kastrun shares an unorthodox way of collecting historical R code changes:

I will not comment on the solution Bob provided, since I don’t know how their infrastructure, roles, and security are set up. At this point, I am grateful for his comment. But what I will comment on is that there is no straightforward way or any out-of-the-box solution. Furthermore, if your R code requires any additional packages, storing the packages with your R code is not that bad an idea, regardless of traffic or disk overhead. And versioning the R code is something that is for sure needed.

To continue from the previous post: getting or capturing R code, once it gets to Launchpad, is tricky, so storing R code in a database table or on the file system seems a better idea.

It’s an interesting concept.  My preference is to use R Tools for Visual Studio and a more traditional source control mechanism.  It involves keeping source control up to date, but that’s a good practice to follow in any case.
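For the flavor of Tomaz’s approach, here is a minimal sketch of a system-versioned temporal table that would version R scripts on every update; the table and column names are illustrative, not his:

CREATE TABLE dbo.RScripts
(
    ScriptID   INT IDENTITY(1,1) PRIMARY KEY,
    ScriptName SYSNAME NOT NULL,
    ScriptBody NVARCHAR(MAX) NOT NULL,
    ValidFrom  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.RScriptsHistory));

-- Every UPDATE to ScriptBody is captured automatically; pull the full
-- change history of one script with FOR SYSTEM_TIME ALL
SELECT ScriptName, ValidFrom, ValidTo, ScriptBody
FROM dbo.RScripts FOR SYSTEM_TIME ALL
WHERE ScriptName = N'PredictSales'
ORDER BY ValidFrom;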

Parameters In rmarkdown Reports

Steph Locke shows how to use table parameters in rmarkdown reports:

The recent(ish) advent of parameters in rmarkdown reports is pretty nifty, but there’s a little bit of behaviour that can come in handy and doesn’t come across in the documentation: you can use table parameters for rmarkdown reports.

Previously, if you wanted to produce multiple reports based off a dataset, you would make the dataset available and then perform filtering in the report. Now we can pass the filtered data directly to the report, which keeps all the filtering logic in one place.

It’s actually super simple to add table parameters for rmarkdown reports.

Click through to see the script.  As promised, it is in fact easy to do.

Samba On Linux Mint

Mark Broadbent shows how to set up SMB shares in Linux:

This quick guide is specifically targeted at the Linux Mint distribution (although it will be applicable to many others). It only describes how to share your Linux filesystem folders and does not go into any detail regarding the advanced Samba functionality.

Even though Linux Mint attempts to make folder sharing more user-friendly, I have never had any success using the GUI-based procedure, and have even struggled with the method described in this article. Furthermore, I prefer to understand what is being configured behind the scenes, so I shall keep to the point and keep it simple.

Very useful post, given the cross-platform move Microsoft is making.

What Is The Data Platform?

Rolf Tesmer has weighed in with his thoughts on the “Data Platform”:

What this has meant is that innovation – in particular in the Azure Public Cloud, ISVs, new data services/products, and new data-related infrastructure – has accelerated dramatically and changed the very definitions of what was previously accepted as comprising the “Data Platform”.

Nowadays when I talk to customers about the “Data Platform” it encompasses a range of services across a mix of IaaS, PaaS and SaaS.  The decision of which data service to deploy now comes down to matching the business case technical requirements with the capability of a purpose-built cloud service – as opposed to (in the past) trying to fit an obvious NoSQL use case into a traditional RDBMS platform.

I now see the “New Data Platform” as much broader than ever before, encompassing many other “non-traditional” data services…

Cf. Eugene Meidinger (who started this) and me (who exacerbated this).  This is an area ripe for consideration.

Stop Being Your Own Worst Enemy With Code

Bert Wagner has advice on making code understandable for future-you:

At the time I wrote it, I probably thought my code was beautiful. An elegant masterpiece. It should have been printed, framed, and hung on a wall of The Programming Hall of Fame. But as clever as I thought I may have been a few years ago, I am rarely able to read my old code without wasting some serious time debugging.

This problem plagued me regularly, so I tried different techniques to make my code easier to understand.

Bert has some good thoughts here, and I’ll add two small bits.  First, there’s a saying that it takes more mental effort to debug code than it takes to write it, so if you’re writing code at the edge of your understanding, effective debugging becomes difficult to impossible.  Second, unless you see a business rule frequently enough to internalize it, your greatest familiarity with the “whys” of the system is right when you are developing.  There is huge value in taking the time to document the rules in an accessible manner; even if you wrote the code, you probably won’t remember that weird edge case at 4 AM six months from now, when you need to remember it the most.

Considerations For Reducing I/O Costs

Monica Rathbun gives a few methods for reducing how many I/O operations a query requires:

Implicit Conversions

Implicit conversions often happen when a query is comparing two or more columns with different data types. In the example below, the system has to perform extra I/O to compare a varchar(max) column to an nvarchar(4000) column, which leads to an implicit conversion and, ultimately, a scan instead of a seek. By fixing the tables to have matching data types, or simply converting the value before evaluation, you can greatly reduce I/O and improve cardinality estimates (the number of rows the optimizer expects).
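To make that concrete, here is a minimal hypothetical repro (Monica’s post has its own demo; these object names are mine):

CREATE TABLE dbo.Accounts
(
    AccountID     INT IDENTITY(1,1) PRIMARY KEY,
    AccountNumber VARCHAR(20) NOT NULL
);
CREATE INDEX IX_Accounts_AccountNumber ON dbo.Accounts (AccountNumber);

-- NVARCHAR literal vs. VARCHAR column: the column side gets a CONVERT_IMPLICIT,
-- which (depending on collation) turns an index seek into a scan
SELECT AccountID FROM dbo.Accounts WHERE AccountNumber = N'ABC-1001';

-- Matching types: no conversion, and the optimizer can seek the index
SELECT AccountID FROM dbo.Accounts WHERE AccountNumber = 'ABC-1001';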

There’s some good advice here if your main hardware constraint is I/O.

Dark Queries

Michael Swart helps root out queries which get recompiled frequently and so won’t be in the cache:

Some of my favorite monitoring solutions rely on the cached queries:

but some queries will fall out of cache or never make it into cache at all. Those are the dark queries I’m interested in today. Let’s look at query recompiles to shed light on some of those dark queries that maybe we’re not measuring.

By the way, if you’re using SQL Server 2016’s query store then this post isn’t for you because Query Store is awesome. Query Store doesn’t rely on the cache. It captures all activity and stores queries separately – Truth in advertising!

Click through for an Extended Event session which looks for recompilation.
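His session is worth grabbing from the post itself; as a rough sketch of the shape such a session takes (the session and file names here are mine):

CREATE EVENT SESSION DarkQueries ON SERVER
ADD EVENT sqlserver.sql_statement_recompile
(
    ACTION (sqlserver.sql_text, sqlserver.database_name)
)
ADD TARGET package0.event_file (SET filename = N'DarkQueries');

ALTER EVENT SESSION DarkQueries ON SERVER STATE = START;

The recompile_cause field on each event tells you why the statement recompiled (temp tables, statistics changes, SET option changes, and so on).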

Resetting SQL Administrators

Chris Lumnah shows how to use dbatools to reset a SQL authenticated administrative account:

As I was going through my environment, I realized I created a new domain controller for my tests. This DC has a new name and domain name which are different from my other VMs. I quickly realized that this will cause me issues later with authentication. No worries. I will just boot up the VMs and then join them to the new domain. Easy-peasy. Now let me go test out my SQL Servers.

DOH!!

I received a login failure: access is denied. Using Windows Authentication with my new domain and recently joined server is not working. Why? …Oh right, my new user ID does not have access to SQL Server itself. As I sit there smacking myself in the head, I am also thinking about the amount of time it will take me to rebuild those VMs. Then it hit me!!!

Read on to see the solution, including a PowerShell one-liner showing how it’s done.
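The one-liner is presumably dbatools’ Reset-DbaAdmin, which handles restarting the instance in single-user mode for you. Doing it by hand, the T-SQL at the core of it (run over a single-user-mode or DAC connection as a local administrator) amounts to a sketch like this, with a hypothetical login name:

-- Create a new sysadmin-level SQL login...
CREATE LOGIN RecoveryAdmin WITH PASSWORD = N'<a strong password>', CHECK_POLICY = ON;
ALTER SERVER ROLE sysadmin ADD MEMBER RecoveryAdmin;

-- ...or reset and re-enable an existing SQL login such as sa
ALTER LOGIN sa WITH PASSWORD = N'<a strong password>';
ALTER LOGIN sa ENABLE;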
