Press "Enter" to skip to content

Day: July 5, 2022

Building a “Solution Summary” Card for Developers

Ben Brown gets the new developers up to speed:

Solution Summary cards are the name I have given to simple 1-page documents we provide to developers to get up-to-speed on a particular solution.

The primary use-case for these cards is to provide better on-boarding for new team members.

An additional benefit is that crafting these short summaries forces people to review how your teams are working and you will often start seeing areas which would benefit from standardization or simplification.   

Click through for an example. I like this idea a lot for a few reasons, which I’ll enumerate here because I’m feeling it:

  1. Developers often won’t read really long documents. They might scan through the documents, Ctrl+F a few words, and refer back to it on occasion but rarely will they read through a document in its entirety.
  2. Long documents get out of date too quickly, as developers are typically bad at keeping documentation up to date, especially as deadlines loom.
  3. An artificial constraint like “Your summary must fit on one printed page” (either front-only or front-and-back) helps focus the mind. It makes you think about what the most important things to include are.
  4. If you are a consultant or just work on a lot of different projects, it can be painful trying to remember the details. Which jump box do you need to go to? Which were the most important bits of code? How do you run the unit tests and deploy the app locally? Even if nobody else ever sees the card, it can help you in the future.
Comments closed

Summarizing Data & AI Summit Announcements

Zach Stagers hits the high notes:

One of the biggest cheers of the keynote was that Delta is being fully open sourced! Databricks continue to share their incredible work to help drive our industry forward. Delta already has wide adoption, but with the open sourced version now being levelled up to the same standard as the ‘proprietary’ one, this should help cement it as the default choice for lake-based storage.

There were some announcements of things to come with Delta too, such as a optimised deletes and updates by removing single rows instead of having to completely rewrite the file. It’ll be really interesting to see how this works, and just how much it boosts performance.

Read on for more notes on several big announcements.

Comments closed

Databases, Applications, and Source Control Repos

Eitan Blumin asks and answers a question:

Following the rise in popularity of DevOps for Databases, many interesting questions are being asked on the topic.

One of these questions is: Should your SQL Database project be in the same source control repository and solution as the App code project? Or maybe they should be in the same repository but separate solutions? Or maybe they should be in completely separate repositories?

Pre-registering my answer here: for most organizations, databases should be in a separate repository. The deployment cadence is different, the deployment mechanism is different, and the people working on each likely differ. Read on for Eitan’s thoughts, which get into more of the nuance behind the answer.

Comments closed

Azure Data Explorer Web Updates

Michal Bar has a few updates to the Azure Data Explorer web tool:

We are focused on continuously improving the results exploration experience in ADX web UI, to make it easy and intuitive. Our goal is to provide an easy-to-use UI so that you will not be required to re-write KQL queries in order to perform light-weight data exploration.

Click through to see how you can search and filter within the results pane (something I’d like to see in other Microsoft data platform tools like SSMS), create series panels on charts from KQL, and more.

Comments closed

Filling in GitHub Repo Details

Kevin Chant practices GitHub hygeine:

To clarify, GitHub hygiene is a term that I use to describe the practice of keeping GitHub repositories healthy.

Some of you have probably noticed I have been doing this more recently. With this in mind, I thought I would share what I have been doing in this post for a couple of reasons.

First of all, to help raise awareness about some of the best practices I have been doing.

Secondly, because I am interested to get feedback from other members of the Microsoft Data Platform community about this. For example, do you also follow the same practices?

This is a reminder that there’s a lot you can include in a GitHub repo aside from the code itself.

Comments closed

More Notes on Parameter Sensitive Plan Optimization

Erik Darling continues a series on Parameter Sensitive Plan Optimization. First up is a missed opportunity:

If we re-run the procedure from the section up above to search for VoteTypeId 2 a second time, Memory Grant Feedback will fix the spill at the Hash Join, and bring the total execution time down to about 15 seconds.

That is an improvement, but… Look at the plan here. If VoteTypeId 2 uses a plan more suited to the number of rows it has to process, the overall time is around 4 seconds, with no need for a memory grant correction.

Second, Erik asks the pressing questions:

Under more favorable circumstances, dynamic SQL gets run, executed, and plans cached and reused with the same frequency as stored procedures.

Now, dynamic SQL isn’t exactly the same as stored procedures. There’s a lot you can do with those that just looks a mess in dynamic SQL, especially longer bits of code.

In today’s post, we’re going to look at how the Parameter Sensitive Plan (PSP) optimization works with dynamic SQL.

Read on to see if PSP works with dynamic SQL.

Comments closed

Tools for a Jump Box

Tracy Boggiano looks in the tool bag:

Having recently taken a new job and introducing a number of new tools to my new coworker I thought I’d share how I setup my jump box to and keep it updated so others can benefit, and I can find it later (I did put this in our internal Confluence pages, but I do have a box in Azure for presentations). In alphabetical order because that’s the only way to make sense of things.  I’d be curious if anybody has anything they use that I should add, so please leave comments.

Read on to see the list, including all of the Azure Data Studio and Visual Studio Code extensions Tracy likes to use.

Comments closed

SET, SELECT, and Variable Assignment

Chad Callihan obliquely reminds us to create those unique constraints (by way of unique indexes):

Did you know there is more than one way to set a variable in SQL Server? You can actually set a variable without using “SET” at all. Let’s look at an example that shows how setting a variable with SELECT can cause a headache when dealing with identical values.

Click through to see the problem in action. One way around this if you do know you are dealing with duplicates and need a specific one is to SELECT TOP(1) with an appropriate ORDER BY clause, just as you would if variable assignment weren’t on the table.

Comments closed