Press "Enter" to skip to content

Month: October 2021

Attorneys General Anti-Trust Suit against Google

This isn’t entirely data-related, but it’s fascinating to read the claims against Google. Here is the full text of the suit (with mild redactions). @PatrickMcGee_ dives into the details on Twitter (link goes to ThreadReaderApp), as does @fasterthanlime.

It’s important to keep in mind that these are allegations not yet proved, and Google’s lawyers will get their chance to respond. But, because Google’s not giving me any money to shill for them, I will say that this looks bad for them. Assuming these allegations are close to accurate, there’s some pretty blatant abuse of monopoly power.

Leave a Comment

Azure Synapse Analytics October 2021 Update

Saveen Reddy summarizes the newest updates in Azure Synapse Analytics:

Use Stringify in data flows to easily transform complex data types to strings

Mapping data flows helps you perform code-free data transformation your Synapse pipelines. When you work with complex data types such as structures, arrays, map, you need to transform them into strings. You can do this by using the new Stringify data transformation simplifying this common task.

Read on for the full set of updates.

Leave a Comment

Enabling Statistics Auto-Creation

Chad Callihan checks the stats:

When we query for data, we don’t always think about the magic that goes into efficiently returning results. One vital piece to this magic is statistics. Statistics in SQL Server are histograms that are used by the query optimizer to determine an optimal execution plan when executing a query. Let’s take a look at the different ways to check your statistics settings and make sure statistics are being automatically created.

Click through to see how.

Leave a Comment

Comparing OPENJSON vs Other List-Parsing Methods in SQL Server

Aaron Bertrand splits a string:

I have long advocated avoiding splitting strings by using table-valued parameters (TVPs), but this is not always a valid option; for example, PHP drivers do not handle this feature yet. A new pattern I’ve seen emerging is to replace splitting comma-separated strings with the new OPENJSON functionality introduced in SQL Server 2016. I wanted to explore why this is not an improvement in the typical case, unless you are using a client application platform that doesn’t support TVPs and your application data already happens to be in JSON format.

Read on for the solution, which has a somewhat surprising outcome.

Leave a Comment

Four DBA ToDos in a New Role

Lee Markum has a starting point for DBAs in a new role:

You’ve just been hired into a DBA role at a new company, or you’ve been given the DBA keys at your current company. Maybe you’re a SysAdmin and your boss has informed you that you are now supposed to manage the SQL Servers as well as everything else on your plate. In any of these situations, you may have some confidence in your skills, but especially in the case of being a new hire, you have absolutely no true idea of what you’re walking into.

In these scenarios, where do you start? Start with these four areas.

Click through for the four areas. I completely agree with Lee on these for DBAs, including the order.

Leave a Comment

Grouping on a Substring

Kevin Hill tries something out:

I was chatting with Jeff (b|t) on my team yesterday and the context escapes me but I had this thought:

“Can you Group By the beginning characters, or a subset, of a field?”

I’m not a developer, so this question never comes my way. Except yesterday.

Click through for the answer. Mind you, this is unlikely to perform well, but if you’re looking at a small enough dataset or need a one-time query and can afford a full scan of the data, so be it.

Leave a Comment

Storage Testing for Azure SQL Managed Instances

Joe Obbish busts out the slide rule:

Lately I’ve been doing some exploratory performance testing on Azure SQL Managed Instances in preparation for a migration to that platform. This blog post documents some storage testing results and may even have practical advice near the end. All testing was done on a gen5 general purpose instance with 8 vCores.

Read on for Joe’s findings. Spoiler alert: there is practical advice at the end.

Leave a Comment

Real-Time Change Detection via Cumulative Sums

Nithin Sankar tracks deviations with cumulative sums:

With the advent of Internet of Things (IOT) and the proliferation of connected devices, comes the challenge of monitoring parts for maintenance before they break down. A common approach revolves around getting data from connected devices and performing a statistical test to determine the likelihood of the device failing. While this common approach is robust, it typically involves a significant time investment in exploratory data analysis, feature engineering, training, and testing to build a predictive model. It, therefore, often lacks the agility required to keep up with the monitoring demands of increasingly time-sensitive initiatives. 

In this context, the question becomes: how can we ensure a similar degree of rigor, but also improve the timeliness and responsiveness of being able to perform predictive maintenance? 

Click through for the process, as well as an example using Azure Stream Analytics and Power BI.

Leave a Comment