Press "Enter" to skip to content

Curated SQL Posts

Automatically Updating dbatools

Garry Bargsley gives us two ways to update dbatools on a schedule:

I have been using dbatools heavily since I was introduced to it.  I have automated processes and created new processes with it.  There are new commands that come out almost daily that fill in certain gaps or enhance current commands.  One way to stay current with these updates is to update your dbatools install frequently.

What better way to do this than to have an auto-update process that runs daily and gets the latest dbatools version for you…

I have put together two ways of doing this, so you can pick your preferred method.  One is via a SQL Agent job and the other uses a Windows Task Scheduler job.

Read on for examples of both techniques.
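In the meantime, here's a minimal sketch of the Task Scheduler flavor, assuming dbatools was installed from the PowerShell Gallery via Install-Module (the task name and schedule are illustrative, not from Garry's post):

# The core of either approach: pull the latest dbatools from the PowerShell Gallery
Update-Module -Name dbatools -Force

# One way to schedule that daily at 6 AM (run from an elevated PowerShell prompt)
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument '-NoProfile -Command "Update-Module -Name dbatools -Force"'
$trigger = New-ScheduledTaskTrigger -Daily -At 6am
Register-ScheduledTask -TaskName 'Update dbatools' -Action $action -Trigger $trigger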


Changing Docker Named Volume Locations

Andrew Pruski answers an attendee question:

A few weeks ago I was presenting at SQL Saturday Raleigh and was asked a question that I didn’t know the answer to.

The question was, “can you change the location of named volumes in docker?”

This is one of the things that I love about presenting: being asked questions that I don’t know the answer to. They give me something to go away and investigate (many thanks to Dave Walden (b|t) for his help!)

Read on for Andrew’s answer.
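While you read, one commonly cited approach (which may or may not be the one Andrew lands on) is to create a named volume with the local driver's bind options, so the volume is backed by a directory of your choosing (the path and names here are illustrative):

# Create a named volume that points at a directory you pick
docker volume create --driver local \
  --opt type=none \
  --opt o=bind \
  --opt device=/path/of/your/choosing \
  sqldata

# Then use it like any other named volume
docker run -d -v sqldata:/var/opt/mssql <your-image>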


Microsoft R Open 3.4.4

David Smith announces Microsoft R Open 3.4.4:

An update to Microsoft R Open (MRO) is now available for download on Windows, Mac and Linux. This release upgrades the R language engine to version 3.4.4, which addresses some minor issues with timezone detection and some edge cases in some statistics functions. As a maintenance release, it’s backwards-compatible with scripts and packages from the prior release of MRO.

MRO 3.4.4 points to a fixed CRAN snapshot taken on April 1, 2018, and you can see some highlights of new packages released since the prior version of MRO on the Spotlights page. As always, you can use the built-in checkpoint package to access packages from an earlier date (for reproducibility) or a later date (to access new and updated packages).

David also spills the beans on when we’ll see MRO 3.5.0.
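As a quick sketch of the checkpoint mechanic David mentions (the date here matches MRO 3.4.4's default snapshot):

# Pin your package library to the CRAN snapshot of April 1, 2018
library(checkpoint)
checkpoint("2018-04-01")

# Subsequent install.packages() and library() calls resolve against that snapshot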


Let’s Not Talk About Timestamp

Randolph West hits us with a misnamed SQL Server data type:

It occurred to me that we haven’t covered the TIMESTAMP data type in this series about dates and times.

TIMESTAMP is the Windows Millennium Edition of data types. It has nothing to do with date and time. It’s a row version. Microsoft asks that we stop calling it TIMESTAMP and use ROWVERSION instead.

Much like DECIMAL is a synonym of NUMERIC, so too is TIMESTAMP a synonym of ROWVERSION. Please call it a ROWVERSION and pretend that TIMESTAMP doesn’t exist. Microsoft is deeply sorry for the confusion.

As I say, dates and times are hard.  But at least this is easy:  if you don’t use it, you won’t have problems with it.
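A quick way to convince yourself it's a row version rather than a timestamp (a minimal sketch; the table is made up):

-- rowversion is the preferred name; timestamp is the deprecated synonym
CREATE TABLE dbo.RowVersionDemo
(
    Id INT NOT NULL PRIMARY KEY,
    SomeValue VARCHAR(50) NULL,
    VersionStamp ROWVERSION  -- auto-populated; changes on every update
);

INSERT INTO dbo.RowVersionDemo (Id, SomeValue) VALUES (1, 'original');
SELECT Id, SomeValue, VersionStamp FROM dbo.RowVersionDemo;

UPDATE dbo.RowVersionDemo SET SomeValue = 'changed' WHERE Id = 1;
SELECT Id, SomeValue, VersionStamp FROM dbo.RowVersionDemo;  -- VersionStamp has changed; no date or time in sight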


Methods For Capturing Cardinality Estimate Statistics

Monica Rathbun gives us five methods for finding cardinality estimate values when running a SQL Server query:

A second option is to use statistics profiling. This was introduced in SQL Server 2014 and is easily set by using SET STATISTICS PROFILE ON, or you can enable query profiling globally using DBCC TRACEON (7412, -1). This trace flag is only available in SQL Server 2016 SP1 and above.  Selecting from the dynamic management view (DMV) sys.dm_exec_query_profiles, you can do real-time query execution progress monitoring while the query is running.  This option will return estimated and actual rows by operator.

Click through for the full set of methods.
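As a sketch of that second option (the session ID is illustrative; run the SELECT from a separate session while the target query executes):

-- Enable the profiling infrastructure globally (SQL Server 2016 SP1 and above)
DBCC TRACEON (7412, -1);

-- From another session, compare estimated rows to actual rows per operator
SELECT session_id,
       node_id,
       physical_operator_name,
       estimate_row_count,
       row_count          -- actual rows returned so far
FROM sys.dm_exec_query_profiles
WHERE session_id = 53;    -- the session running the query of interest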


Sign Those Stored Procedures

David Fowler shows how we can sign stored procedures to give users limited rights that they otherwise should not have:

One way that we can do this is by signing the procedure (you can also use this with functions and triggers) with a certificate or an asymmetric key.

In this post I’m only going to look into signing a stored procedure with a certificate but the same ideas can be applied for the other objects and with an asymmetric key.

So…

We’re going to create a certificate and sign our stored proc using that certificate.  We’ll then create a user based on the certificate and grant the new certificate user the appropriate permissions to run the stored procedure.

Every SQL Server DBA should know how to do this, but in my experience, it’s a small percentage who do.
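The moving parts look roughly like this (object names, password, and the granted permission are illustrative, not from David's post):

-- 1. Create a certificate in the database containing the procedure
CREATE CERTIFICATE SigningCert
    ENCRYPTION BY PASSWORD = 'StrongPasswordHere1!'
    WITH SUBJECT = 'Certificate for signing stored procedures';

-- 2. Sign the stored procedure with it
ADD SIGNATURE TO dbo.MyProc
    BY CERTIFICATE SigningCert
    WITH PASSWORD = 'StrongPasswordHere1!';

-- 3. Create a user from the certificate and grant it the rights the procedure needs
CREATE USER SigningCertUser FROM CERTIFICATE SigningCert;
GRANT SELECT ON dbo.SensitiveTable TO SigningCertUser;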


Azure SQL Data Warehouse Generation 2

James Serra announces changes to Azure SQL Data Warehouse:

The changes in Azure SQL DW Compute Optimized Gen2 tier are:

  • 5x query performance via an adaptive caching technology, which takes a blended approach of using remote storage in combination with a fast SSD cache layer (using NVMe) that places data next to compute based on user access patterns and frequency

  • Significant improvement in serving concurrent queries (32 to 128 queries/cluster)

  • Removes the previous cap on columnar data, enabling unlimited columnar data volume

  • 5 times higher computing power compared to the current generation by leveraging the latest hardware innovations that Azure offers via additional Service Level Objectives (DW7500c, DW10000c, DW15000c and DW30000c)

  • Added Transparent Data Encryption with customer-managed keys

Those are some good improvements.  The concurrency increase in particular makes it possible for Azure SQL DW to be useful in a much larger number of environments.


Inside SQL Server 6.5

Brent Ozar reviews a blast from the past:

I picked up half a dozen used books about SQL Server 6.5, then spent a delightful weekend reading them. Seriously delightful – lemme tell you just how into it I was. Erika and I eat all weekend meals out at restaurants, but she saw me so happily curled up in my chair reading that she insisted on going out and getting tacos for us just so I wouldn’t have to get up. I was having that good of a time READING BOOKS ABOUT SQL SERVER 6.5. (Also, Erika is amazing. Moving on.)

To bring you that same fun, I wanna share with you a few pages from Inside SQL Server 6.5 by Ron Soukup, one of the fathers of SQL Server.

It’s a great read.  My contribution to the Old But Good oeuvre is the Handbook of Relational Database Design by Candace Fleming and Barbara von Halle.  For my money, it has what I still consider the best primer on database normalization out there.  It also has a bunch of stuff that we should be glad we don’t do anymore, like figuring out specific file layouts for non-clustered indexes to minimize the number of disk rotations needed to retrieve a record of data.


Toward Interpretable Machine Learning

Christoph Molnar shows off a couple of R packages which help interpret ML models:

Machine learning models repeatedly outperform interpretable, parametric models like the linear regression model. The gains in performance have a price: The models operate as black boxes which are not interpretable.

Fortunately, there are many methods that can make machine learning models interpretable. The R package iml provides tools for analysing any black box machine learning model:

  • Feature importance: Which were the most important features?
  • Feature effects: How does a feature influence the prediction? (Partial dependence plots and individual conditional expectation curves)
  • Explanations for single predictions: How did the feature values of a single data point affect its prediction? (LIME and Shapley value)
  • Surrogate trees: Can we approximate the underlying black box model with a short decision tree?
The iml package works for any classification and regression machine learning model: random forests, linear models, neural networks, xgboost, etc.

This is a must-read if you’re getting into model-building. H/T R-Bloggers
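To give a flavor of the API, here's a minimal sketch in the style of iml's introductory example (the model and dataset are illustrative: a random forest on the Boston housing data):

library(iml)
library(randomForest)

# Fit any black-box model; a random forest serves as the example here
data("Boston", package = "MASS")
rf <- randomForest(medv ~ ., data = Boston, ntree = 50)

# Wrap the model and data in a Predictor object
X <- Boston[, which(names(Boston) != "medv")]
predictor <- Predictor$new(rf, data = X, y = Boston$medv)

# Permutation feature importance: which features matter most?
imp <- FeatureImp$new(predictor, loss = "mae")
plot(imp)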


Using map And flatMap In Scala

Shubham Verma explains the map and flatMap functions in Scala:

Consider two sets, A = {-2, -1, 0, 1, 2} and B = {0.5, 1, 1.5, 2.5, 4, 4.5, 5, 5.5}, and a function f: A => B defined by y = x ^ 2 + 0.5, where x is an element of set A and y is the corresponding element of set B. The function f is applied to every element of set A, but its results may cover only a subset of set B.

From this, we can draw an analogy: sets A and B can be seen as collections in a programming language, and “f” as a function that takes an element from A and returns an element that exists in B. The point to note is that, since Scala promotes immutability, whenever we apply map (or any other transformer) to a collection with elements of type A, it returns a new collection of the same kind with elements of type B. The snippet below makes this concrete.

val result: List[B] = listOfA.map(f)   // where listOfA: List[A] and f: A => B

So when a map operation is applied to a collection (here a List) with elements of type A, passing f as its argument, it applies that function to every element of the List and returns a new collection (again a List) with elements of type B.

Read the whole thing.
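To make the analogy concrete, here's a small self-contained sketch covering map, plus the flatMap from the post's title (the collection-returning function g is my own illustration):

object MapFlatMapDemo extends App {
  // Set A from the example, as a Scala collection
  val a: List[Int] = List(-2, -1, 0, 1, 2)

  // f: A => B from the example: f(x) = x^2 + 0.5
  val f: Int => Double = x => x * x + 0.5

  // map applies f to every element and returns a new List
  val mapped: List[Double] = a.map(f)
  println(mapped)      // List(4.5, 1.5, 0.5, 1.5, 4.5)

  // flatMap applies a collection-returning function, then flattens the results
  val g: Int => List[Double] = x => List(x.toDouble, x * x + 0.5)
  val flatMapped: List[Double] = a.flatMap(g)
  println(flatMapped)  // List(-2.0, 4.5, -1.0, 1.5, 0.0, 0.5, 1.0, 1.5, 2.0, 4.5)
}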
