Press "Enter" to skip to content

Curated SQL Posts

Resolving Collation Conflicts

Steve Stedman fights with collations:

One possible cause of the “Cannot resolve the collation conflict” error message is that your database collation doesn’t match the TempDB Collation.

[…]

Msg 468, Level 16, State 9, Line 11

Cannot resolve the collation conflict between “SQL_Latin1_General_CP437_CI_AS” and “Latin1_General_CI_AS” in the equal to operation.

Read on to see an example of how you can cause this problem and what you can do about it. I agree with Steve’s general advice to keep collations the same across an instance if you can.

Comments closed

All about Synchronous Stats Updates

Paul Randal shares some thoughts about synchronous stats updates:

The SQL Server query optimizer makes use of statistics during query compilation to help determine the optimal query plan. By default, if the optimizer notices a statistic is out-of-date because of too many changes to a table, it will update the statistic immediately before query compilation can continue (only the statistics it needs, not all the statistics for the table).

Note that “too many” is non-specific because it varies by version and whether trace flag 2371 is enabled – see the AUTO_UPDATE_STATISTICS section of this page for details.

Read on to learn more, including the problems that synchronous stats updates can cause, what you can do to avoid them, and ways you can tell that synchronous stats updates are a problem in your environment.

Comments closed

Power BI Push Datasets and Real-Time Dashboards

Marco Russo and Alberto Ferrari don’t have time to wait:

How many times have you heard an executive request a panel with the company’s sales data in real time? How frequently has this single request – which is more often a preference than an important business requirement – affected the overall architecture of your analytical solution?

In the Power BI world, requirements for real time often drive the creation of a pure DirectQuery model, with no aggregations to avoid data latency. This choice is incredibly expensive: the computational cost of each individual query is borne by the data source, which is often a relational database like SQL Server. On top of its cost, with this approach you will face scalability, performance, and modeling issues. Indeed, the relational database on top of which DirectQuery runs is mostly designed for transactional processing instead of being optimized for the workload of analytical processes. Optimizing the model is both difficult and expensive. Finally, using DirectQuery creates specific modeling constraints and the need for modeling workarounds to obtain good performance.

Creating an entire model using DirectQuery for the sole purpose of achieving a few real-time dashboards is definitely excessive. The primary scenario where relying on DirectQuery makes sense is when it is not feasible to import data quickly enough to satisfy the latency requirements for the majority of the reports. When the entire model can be in import mode, and a small number of dashboards require DirectQuery, there are better options available.

Definitely worth the read.

Comments closed

Page Allocation Reports in SSMS

Eitan Blumin has updated an open source project:

Back in April 2020, I created an open-source project called “SQL Server Page Allocation Reports“. It consisted of a set of SQL queries and some Power BI reports that can be used for visualizing the size and locations of your data and transaction log pages.

Well, recently I also added SSMS Custom Reports into the mix. So, it’s time to revisit this project and see what’s new!

Click through to see what’s new.

Comments closed

Attorneys General Anti-Trust Suit against Google

This isn’t entirely data-related, but it’s fascinating to read the claims against Google. Here is the full text of the suit (with mild redactions). @PatrickMcGee_ dives into the details on Twitter (link goes to ThreadReaderApp), as does @fasterthanlime.

It’s important to keep in mind that these are allegations not yet proved, and Google’s lawyers will get their chance to respond. But, because Google’s not giving me any money to shill for them, I will say that this looks bad for them. Assuming these allegations are close to accurate, there’s some pretty blatant abuse of monopoly power.

Comments closed

Azure Synapse Analytics October 2021 Update

Saveen Reddy summarizes the newest updates in Azure Synapse Analytics:

Use Stringify in data flows to easily transform complex data types to strings

Mapping data flows helps you perform code-free data transformation your Synapse pipelines. When you work with complex data types such as structures, arrays, map, you need to transform them into strings. You can do this by using the new Stringify data transformation simplifying this common task.

Read on for the full set of updates.

Comments closed

Enabling Statistics Auto-Creation

Chad Callihan checks the stats:

When we query for data, we don’t always think about the magic that goes into efficiently returning results. One vital piece to this magic is statistics. Statistics in SQL Server are histograms that are used by the query optimizer to determine an optimal execution plan when executing a query. Let’s take a look at the different ways to check your statistics settings and make sure statistics are being automatically created.

Click through to see how.

Comments closed

Comparing OPENJSON vs Other List-Parsing Methods in SQL Server

Aaron Bertrand splits a string:

I have long advocated avoiding splitting strings by using table-valued parameters (TVPs), but this is not always a valid option; for example, PHP drivers do not handle this feature yet. A new pattern I’ve seen emerging is to replace splitting comma-separated strings with the new OPENJSON functionality introduced in SQL Server 2016. I wanted to explore why this is not an improvement in the typical case, unless you are using a client application platform that doesn’t support TVPs and your application data already happens to be in JSON format.

Read on for the solution, which has a somewhat surprising outcome.

Comments closed

Four DBA ToDos in a New Role

Lee Markum has a starting point for DBAs in a new role:

You’ve just been hired into a DBA role at a new company, or you’ve been given the DBA keys at your current company. Maybe you’re a SysAdmin and your boss has informed you that you are now supposed to manage the SQL Servers as well as everything else on your plate. In any of these situations, you may have some confidence in your skills, but especially in the case of being a new hire, you have absolutely no true idea of what you’re walking into.

In these scenarios, where do you start? Start with these four areas.

Click through for the four areas. I completely agree with Lee on these for DBAs, including the order.

Comments closed

Grouping on a Substring

Kevin Hill tries something out:

I was chatting with Jeff (b|t) on my team yesterday and the context escapes me but I had this thought:

“Can you Group By the beginning characters, or a subset, of a field?”

I’m not a developer, so this question never comes my way. Except yesterday.

Click through for the answer. Mind you, this is unlikely to perform well, but if you’re looking at a small enough dataset or need a one-time query and can afford a full scan of the data, so be it.

Comments closed