Curated SQL – Page 1080 – A Fine Slice Of SQL Server

SSMS Regular Expressions

Published 2019-12-19 by Kevin Feasel

Tim Mitchell looks at regular expressions in SQL Server Management Studio:

Regular expressions (or simply regex for short) have long been used by system administrators and data professionals for searching and manipulating text. Regular expressions allow the user to find, replace, and manipulate text based on the pattern they define in the expression. While every text editor allows simple search-and-replace capabilities, regex allows for searching for partial matches, using wildcards, and even integrating special characters (such as newlines and tabs) into the search or replacement text.
Regular expressions have been a part of SSMS for as long as I can remember, and make the process of pattern-based SQL code search much easier. In this tip, I’ll show you a couple of brief examples of the use of regular expressions for working with SQL code in Management Studio.

Regular expressions have been in the product for a long time, but the set of available regular expressions changed when SSMS moved over to the Visual Studio shell. And in some ways (particularly around capture groups), that was a change for the worse.

Comments closed

Making an Executable JAR from sbt

Published 2019-12-18 by Kevin Feasel

Sakshi Gawande walks us through two problems you might hit when building Scala projects into JARs:

Now, before going to see the solution we first understand why this problem occurs. When you uses Simple Build Tool Command ‘sbt package’, it creates a jar file that includes the class files from your source code and also the content from your src/main/resources folder.
But there are mainly two things which is important to execute jar file, are not included

Read on for these two things.

Comments closed

Jupyter Notebooks and Cosmos DB

Published 2019-12-18 by Kevin Feasel

Hasan Savran shows how we can use Jupyter notebooks with Cosmos DB:

After you enable the Notebook options, you are ready to analyze or visualize your data thanks to Python language and Python packages. Cosmos DB makes your life easy to write Python and install custom packages to use with your data. There are couple of great internal commands and wildcards you should know if you like to use Notebooks in Azure Cosmos DB. First one I want to introduce you is, %%sql command. This command lets you select data from your containers by using SQL API. You can select data and add it to your Python data frames. You need to define which database and container you want to use before you pass your query. Here is an example. In the following example, I want to use my database named Stackoverflow, and container named Posts. Then I pass the query.

These are internal notebooks, meaning no separate Jupyter server required. There’s a separate way of learning the Cosmos API from external notebooks.

Comments closed

Always Encrypted with Secure Enclaves

Published 2019-12-18 by Kevin Feasel

Ned Otter has started a new series on Always Encrypted with Secure Enclaves in SQL Server 2019:

SQL 2019 supports an enhanced version of Always Encrypted, known as “Secure Enclaves”. What is an enclave? It’s like a consulate: “….a state that is enclosed within the territory of another state”.
It takes the form of a protected region of memory within the SQL Server environment, requiring special credentials for access. Data in the secure enclave lives in an unencrypted state.
However, as I’ll discuss later in this series, depending on how your organization implements Always Encrypted with Secure Enclaves, it might not be as secure as you had hoped.

That’s pretty ominous. The first part is a fairly high-level overview which gets you familiar with enclaves.

Comments closed

Why So Few Columnstore Indexes Around?

Published 2019-12-18 by Kevin Feasel

Grant Fritchey has a bit of a rant about people not using Columnstore indexes as much as they should:

It was already common knowledge that columnstore indexes didn’t work for most of us.
Fact is, that’s not true. Now that we have clustered columnstore and non-clustered columnstore, you can go nuts. Most of your data access is through analytical channels? Awesome, use a clustered columnstore. Sometimes though, you need point lookups. Not a problem, add a nonclustered b-tree index to the clustered columnstore. Go here to learn more about Columnstore Indexes.
In short, today, we can completely orient our data storage with our principal data access. Yet, most people are not using these things at all.

One of my interview questions is about columnstore indexes. I’ve learned that I needed to preface it with “What’s the latest version of SQL Server you’ve worked with?” A lot of people answer 2012. Even among the people who use 2016, the normal answer is that they haven’t learned about columnstore yet. And that goes back to Grant’s learning gap: it’s not that hard to grab a book on SQL Server 2019, spin up a Docker container, and dive in. Or watch a course, spin up a Docker container, and follow along. Or read a blog post, spin up a Docker container, and…well, you get the idea.

Comments closed

When Missing Index Requests Go Missing

Published 2019-12-18 by Kevin Feasel

Erik Darling doesn’t want you to put missing index requests functionality on the back of a milk carton:

Rebuilding indexes will clear missing index requests. So before you go Hi-Ho-Silver-Away rebuilding every index the second an iota of fragmentation sneaks in, think about the information you’re clearing out every time you do that.
You may also want to think about Why Defragmenting Your Indexes Isn’t Helping, anyway. Unless you’re using Columnstore.

There are several reasons why you might not get a missing index request and Erik enumerates them for us.

Comments closed

Data Copy & Package Execution in ADF

Published 2019-12-18 by Kevin Feasel

Cathrine Wilhelmsen continues a series on Azure Data Factory. First, we get to see how to copy data from on-prem SQL Servers:

In the previous post, we looked at the three different types of integration runtimes. In this post, we will first create a self-hosted integration runtime. Then, we will create a new linked service and dataset using the self-hosted integration runtime. Finally, we will look at some common techniques and design patterns for copying data from and into an on-premises SQL Server.
And when I say “on-premises”, I really mean “in a private network”. It can either be a SQL Server on-premises on a physical server, or “on-premises” in a virtual machine.

Then, we learn how to run SSIS packages in Azure Data Factory:

Two posts ago, we looked at the three types of integration runtimes and created an Azure integration runtime. In the previous post, we created a self-hosted integration runtime for copying SQL Server data. In this post, we will complete the integration runtime part of the series. We will look at what SSIS Lift and Shift is, how to create an Azure-SSIS integration runtime, and how you can start executing SSIS packages in Azure Data Factory.

I’m going to guess that the next post will be all about the third integration runtime.

Comments closed

Listing M Functions in Power Query

Published 2019-12-18 by Kevin Feasel

Cecilia Brusatori shows how you can get a list of M functions available in Power Query:

No need to go somewhere else to get the M functions documentation, you can access all that from your Power Query Editor.
You’d be able to see the function definition, parameters with descriptions, usage and output. Handy tip!

Click through for the step-by-step directions.

Comments closed

Problems with sp_estimate_data_compression_savings

Published 2019-12-18 by Kevin Feasel

Andy Mallon knows it’s getting close to Festivus and he has some grievances to air:

If you’re working with compressed indexes, SQL Server provides a system stored procedure to help test the space savings of implementing data compression: sp_estimate_data_compression_savings. Starting in SQL Server 2019, it can even be used to estimate savings with columnstore.
I really don’t like sp_estimate_data_compression_savings. In fact, I kind of hate it. It’s not always very accurate–and even when it is accurate, the results can be misleading. Before I get ranty about why I don’t like it, let’s look at it in action.

Andy makes good points in this, so check it out.

Comments closed

Schiphol Takeoff: Low-Code Automated Deployment

Published 2019-12-17 by Kevin Feasel

Tim van Cann and Daniel van der Ende have an open source project for automatic deployment on Azure:

To give a bit more insight into why we built Schiphol Takeoff, it’s good to take a look at an example use case. This use case ties a number of components together:
– Data arrives in a (near) real-time stream on an Azure Eventhub.
– A Spark job running on Databricks consumes this data from Eventhub, processes the data, and outputs predictions.
– A REST API is running on Azure Kubernetes Service, which exposes the predictions made by the Spark job.
Conceptually, this is not a very complex setup. However, there are quite a few components involved:
– Azure Eventhub
– Azure Databricks
– Azure Kubernetes Service
Each of these individually has some form of automation, but there is no unified way of coordinating and orchestrating deployment of the code to all at the same time. If, for example, you were to change the name of the consumer group for Azure Eventhub, you could script that. However, you’d also need to manually update your Spark job running on Databricks to ensure it could still consume the data.

This looks pretty nice. I’ll need to dive into it some more.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Curated SQL Posts