Press "Enter" to skip to content

Curated SQL Posts

Search Optimization in Snowflake

Arun Sirpal doesn’t have time to create indexes:

I will use a clone of the table to compare it to when search optimisation is on. I will make sure no caching in on which could affect the test.
I activate the feature via:

ALTER TABLE data_staging ADD SEARCH OPTIMIZATION;

This takes time! If you run something like the below to confirm 100% completion. This is because there is a maintenance service that runs in the background responsible for creating and maintaining the search access path:

Click through to see what happens and the kinds of performance gains Arun realized.

Comments closed

Feature Branching for Database Projects

Olivier Van Steenlandt describes one branching strategy and applies it to database development:

Depending on how you have defined your branching strategy, you will start development differently. Below I’m defining a few different branching strategies:

1. No branching

2. Branching/environment

3. Branching/feature

4. …

In the past, I have used all of the above. I need to tell you that the Branching/feature strategy allows me to be the most flexible for database development. Why? Let’s dive into this method for now:

Read on to learn more.

Comments closed

Approximate Percentiles in SQL DB and SQL MI

Balmukund Lakhani has an announcement:

Approximate query processing was introduced to enable operations across large data sets where responsiveness is more critical than absolute precision. Approximate operations can be used effectively for scenarios such as KPI and telemetry dashboards, data science exploration, anomaly detection, and big data analysis and visualization. Approximate query processing family has enabled a new market of big data HTAP customer scenarios, including fast-performing dashboard and data science exploration requirements.  

Today, we are announcing preview of native implementation of APPROX_PERCENTILE in Azure SQL Database and Azure SQL Managed Instance. This function will calculate the approximated value at a provided percentile from a distribution of numeric values.

This is way faster than using the PERCENTILE_CONT() or PERCENTILE_DISC() window functions. For a decent-sized query, I was getting anywhere from 5-20x performance improvements, and the larger the dataset, the bigger the gains. It is important to note that the approximate percentiles are not window functions, so you don’t get one row back per row of input.

Comments closed

Improving Power BI Q&A with Synonyms

Patrick LeBlanc pulls out the thesaurus:

Most struggle with getting Q&A to be effective in Power BI. Usually this comes down to either model naming or synonyms. Patrick shows you how you can update these and also a nice feature to let you share them with others.

I’ve found the Power BI Q&A component to be a bit tetchy, even with synonyms, when you’re asking for non-trivial slices of the data. Still, what Patrick shows does help a lot.

Comments closed

Storage Snapshots and In-Memory OLTP

Andy Yun answers a question:

Can I still take storage-array snapshots and if yes, will I lose data in my memory-optimized tables? What about data inside my non-durable tables?

Thankfully, the question was not in the headline. Therefore, Betteridge’s Law of Headlines does not apply and the answer may be either ‘yes’ or ‘no’ depending on the facts. Speaking of which, to find that answer, click through and read Andy’s post.

Comments closed

Power BI Bookmarks and Grouping Visuals

Mara Pereira shares a tip:

Probably one of the most annoying things about creating bookmarks is… updating them!

However, there is a trick that will make your life so much easier and the process a lot faster.

And the trick is…

GROUP YOUR VISUALS and always select the option “SELECTED VISUALS” when creating your bookmark

Read on to see how it works.

Comments closed

Editing Camera Image Metadata with R

Neil Saunders has some trail cameras:

The camera model I chose is the Campark T85 which for me, had the right combination of features and price point. One useful feature is the ability to transfer images and video to a phone wirelessly (albeit through a rather clunky phone app). Unfortunately, images retrieved in this way have one major flaw: an almost-complete absence of metadata. There is no GPS in the camera of course, but the EXIF data does not include the date/time of the image, nor the camera make.

With a little research, I found a way to add this information to the images later using R and some additional software named exiftool. Here’s how I did it.

Read on to see how Neil solved this issue with a bit of R.

Comments closed

Good Practices for Power BI Development

Reza Rad shares some thoughts with us:

DAX is the language of writing calculations in Power BI. We use DAX to write calculations such as year-over-year change and percentage, or percentage of the total or rank of customers by their yearly revenue. Writing calculations in DAX takes time, and you may likely need to re-use a calculation in multiple reports.

Creating copies of the PBIX file every time for reusing the calculation is not ideal. The better approach is to have a shared dataset created by DAX calculations and then create thin reports with live connections to the shared Power BI dataset. Using a shared dataset ensures that all the reports are using the same DAX calculations. If a change is needed, it is only needed in the shared dataset. Maintaining a solution like this would be much easier.

Click through for a dozen or so recommendations.

Comments closed

Monitoring the Serverless SQL Pool via Log Analytics

Sidney Cirqueira shows how to monitor SQL requests in Azure Synapse Analytics:

Today I would like to share a scenario that I was working on one of my serverless SQL Pool support cases. The customer asked for an advice on how to monitor serverless SQL requests by using log analytics.

The intention of this guide is to help you with choosing the configuration required to easily setup the Synapse Analytics Workspace monitoring and all other considerations about how to monitor serverless SQL requests with Azure Monitor. Spoiler: At the end of this article, I will share the latest version of the serverless workbook posted on the Azure_Synapse_Tool_Box. This includes a really cool way to see query execution information.

Read on for that and definitely check out the Azure Synapse Toolbox if you’re a Synapse user.

Comments closed

Find When a Table was Dropped

Andrea Allred does some sleuthing:

Say you have a user come to you and they dropped a table sometime yesterday, but they don’t remember when and now they need it back. You could start the restore process and roll through logs until you see the drop and then restore to the hour before or you could run this super cool query to get the time the table was dropped.

Click through to see when and even which user did it—assuming you don’t have everybody going through a connection pooled account.

Comments closed