Press "Enter" to skip to content

Author: Kevin Feasel

Auto-Escaping XML Characters

Emanuele Meazzo shows how you can auto-escape XML characters using T-SQL:

Recently I had to look up the definition for a bunch of SQL objects and didn’t want to manually retrieve them manually in SSMS (with Create Scripts) or Visual Studio (by searching the object name in my TFS repository).

Since lazyness and automation are the basis of a well done engineering work, I wanted to create a list, where I could basically click on the object that I needed and see the definition right away, without any tool or having to code something externally, of course.

Click through for the solution, which is short and sweet.

Comments closed

Cancelling Resumable Index Creation

Brent Ozar takes us through a couple considerations when using online, resumable index creation:

In SSMS, you’re used to being able to click the “Cancel” button on your query, and having your work rolled back.

You’re also used to being able to kill a query, and have it automatically roll back.

Neither of those are true with resumable index creations. In both cases, whether you kill the index creation statement or just hit the Cancel button in SSMS to abort your request, your index creation statement is simply paused until you’re ready to come back to it. (Or, it’s ready to come back to haunt you, as we saw above.)

There are some good things to keep in mind here.

Comments closed

Finding Missing Values with Tally Tables

David Fowler shows one way to find missing values using a tally table:

This is going to be a bit of a brain storming post that comes from an interesting question that I was asked today…

“I’ve got a table with a ID code field, now some of the rows have a value in that field and some are NULL. How can I go about filling in those NULL values with a valid code but at the same time avoid introducing duplicates?”

Click through for David’s solution.

Comments closed

Databricks Dashboards

Megan Quinn takes us through building dashboards with Apache Zeppelin on Databricks:

The first step in any type of analysis is to understand the dataset itself. A Databricks dashboard can provide a concise format in which to present relevant information about the data to clients, as well as a quick reference for analysts when returning to a project.

To create this dashboard, a user can simply switch to Dashboard view instead of Code view under the View tab. The user can either click on an existing dashboard or create a new one. Creating a new dashboard will automatically display any of the visualizations present in the notebook. Customization of the dashboard is easily achieved by clicking on the chart icon in the top right corner of the desired command cells to add new elements.

This isn’t quite a step-by-step guide but does spur on ideas.

Comments closed

Fun with Emoji in SSMS

Solomon Rutzky shares a method to generate any Unicode character in SQL Server Management Studio:

I used to go to the Emoticons (Emoji) 1F600—1F64F page of unicode-table.com to copy and paste characters, code points, or check the encoding chart at the bottom of each character page (the “hex” column of both “UTF-16BE” and “UTF-16LE” rows have proven most useful).
But not anymore. Now, I just hit:   Ctrl + 0.

When I do that, I get a list of 188,657 code points. Each row contains the official code point value (“U+HHHH”), the integer value, the hex value (“0xHHHH”), the character itself, the UTF-16 Little Endian byte sequence (how it is actually stored, and what you get if you convert an NVARCHAR value to VARBINARY), the surrogate pair values, the T-SQL notation (which does not require using an _SC or _140_ collation), the HTML notation (“&#xHHHH;”), and finally the C-style notation (“\xHHHH” ; used for C / C++ / C# / Java / etc). I can copy and paste any of those values and use them in queries, emails, blog posts, .NET code, and so on.

Click through to see how Solomon does this.

Comments closed

Working with Columns in Spark

Achilleus has a two-parter on working with columns in Spark. Part 1 covers some of the basic syntax and several functions:

Also, we can have typed columns which is basically a column with an expression encoder specified for the expected input and return type.

scala> val name = $"name".as[String]
name: org.apache.spark.sql.TypedColumn[Any,String] = name
scala> val name = $"name"
name: org.apache.spark.sql.ColumnName = name

There are more than 50 methods(67 the last time I counted ) that can be used for transformations on the column object. We will be covering some of the important methods that are generally used.

Part 2 covers other functions including window functions:

17) over
This is one of the most important function that is used in many of the window operations.We can talk about the window function in detail when discuss about aggregation in spark but for now, it will be fair enough to say that over method provides a way to apply an aggregation over a window specification which in turn can be used to specify partition, order and frame boundaries of the aggregation.

Check out both of these posts for useful tidbits.

Comments closed

Creating Threadpools with ExecutorService in Kafka

Prasanth Nair shows how we can use Java’s ExecutorService to create threadpools for Kafka consumers:

Apache Kafka is one of today’s most commonly used event streaming platforms. While using the Kafka platform, quite often, we run into a scenario where we have to process a large number of events/messages that are placed on a broker. Traditional approaches, where a consumer is listening to a topic and then processes these message within the consumer itself, can become a performance bottleneck if the number of messages being placed on the topic is high. In such cases, the rate at which a consumer can process messages will be very low, as there are a large number of messages getting placed on the topic. A potential solution that can be applied in such a scenario is to offload message processing to the worker threads in a thread pool.

Click through for the Java code.

Comments closed

Conditional Formatting on Text Fields in Power BI

Matt Allington shows how you can apply conditional formatting to non-numeric fields in Power BI:

The high level process is to:
1. Create a measure that returns a colour as the result

1. It can be a word, such as blue, red, green
2. It can be a hex code for a colour, like #40E0D0″, “#FFA07A”
2. Use conditional formatting and use the measure to apply the formatting on the text as a rule.

Read on for a demo.

Comments closed

Defending ANSI-89 Syntax

Allan Hirt prefers ANSI-89 join syntax:

Pedro Lopes who is now on the SQL Server dev team wrote the blog post “T-SQL Misconceptions – JOIN ON vs. WHERE” which does a good job of explaining the “it depends” around the newer style of joins. Why link this article? Microsoft deprecated the OUTER JOIN operators in SQL Server 2008. Two other sources of information on this are here and here (the first is Ward Pond’s old technet blog, and sadly will probably go away soon). If you’re keeping score at home, WHERE clauses are not deprecated except if you’re using *= and =*). The changes people made were wholly unnecessary and as the author, the newer stuff is harder to decipher than what I originally did. They were putting their own biases onto things.

I personally do not like ANSI-89 syntax because it blurs the lines between filters and join criteria. Those are separate things serving different purposes and keeping them separate clarifies queries more than it obscures. Allan’s example doesn’t have any filters but in a more complex scenario with several filters and several join criteria, it can require extra care figuring out what’s going on, especially with multi-column join criteria and filters mixed in (meaning it’s not join criteria and then filters, but a mishmash of the two).

Comments closed

Query Store: Forcing Parallel Plans

Grant Fritchey shows us that you can force a parallel plan using Query Store:

I love the questions I get when presenting: Can You Force a Parallel Plan in Query Store. I haven’t a clue. The trick I think is going to be in setting up the test. Let’s try it out.

Reliably Getting a Parallel Plan
Because this is for testing, rather than try to build some crazy query that may or may not go parallel, I’ve decided to cheat. I’ll take advantage of a little functionality that ensures I see a parallel plan when I want to.

Read on to see Grant’s trick and also the results of this test.

Comments closed