Press "Enter" to skip to content

Day: April 16, 2021

Check Those Feature Distributions

Antoine Rebecq shares a warning:

I was recently working on a cool dataset that looked unusually friendly. It was tidy, neat, interesting… the kind of things that you rarely encounter in the wild! My goal was to build a super simple predictor for one of the features. However, I kept getting poor results and at first couldn’t figure out what was happening.

There’s some good, practical advice in there, so check it out. H/T R-Bloggers

Comments closed

Conditional Expressions in Elasticsearch

The Hadoop in Real World team explains how to perform OR, AND, and NOT operations in Elasticsearch queries:

We can specify conditional expressions like OR, AND using the Query expression during search in Elasticsearch.

We have an index named account and in the index we have details of account owners including their name, address, age, sex, employer etc.

Let’s search the documents with AGE=25 and STATE IN (‘ca’, ‘ny’) in the index.

As a spoiler, it’s not as easy as using OR, AND, and NOT, though there are synonyms.

Comments closed

Automating Workflows in Azure DevOps with Logic Apps

Elie Bou Issa does some no-code automation:

Azure Logic Apps is a cloud service to help you schedule, automate, and orchestrate tasks and workflows between apps and across enterprises and organizations. A Logic App can be built using the Azure portal, or infrastructure as code.

By the end of this article, you will have a good understanding of leveraging a Logic App for Azure DevOps to automate the create of work items, in addition to creating an automated approval-based workflow using Office 365.

Click through for the demo. This is useful on its own, especially with non-technical product managers, but you can extend the use of Logic Apps quite a bit and automate more work without writing much code.

Comments closed

Using AD Authentication on Linux when Connecting to SQL Server

Daniel Hutmacher shares some hard-earned wisdom:

I’m a complete beginner at Linux, so I should preface this post with the fact that these are my humble notes after hours of pulling my hair. It’s not really a fully-fledged how-to article, and there are lot of things I’m not covering. But I figured it may help someone out there at some point.

Also, different Linux distros and versions will behave differently, so your mileage will most likely vary.

For the purposes of this post, I’m on Red Hat Enterprise 8.3.

Note that this is using a Linux-based client, rather than talking about SQL Server on Linux.

Comments closed

Inserted and Deleted Scans

Hugo Kornelis has a two-parter for us. First up is the inserted scan operator:

The Inserted Scan operator is only found in execution plans for code in triggers. It is used for queries that read data from the inserted pseudo-table. Its counterpart, Deleted Scan, reads from the deleted pseudo-table.

This pseudo-table contains a copy of all the rows that were inserted in AFTER INSERT triggers, or the new content of the data in all affected rows in AFTER UPDATE triggers. In INSTEAD OF INSERT or INSTEAD OF UPDATE triggers, the data in the inserted pseudo-table is the data that would have been inserted, or the data as it would have been after the update. In AFTER DELETE and INSTEAD OF DELETE triggers, using the inserted pseudo-table is allowed but returns no data.

Then, the deleted scan:

This pseudo-table contains a copy of all rows that has just been deleted in AFTER DELETE triggers, or the original data of all affected rows in AFTER UPDATE triggers. In INSTEAD OF DELETE or INSTEAD OF UPDATE triggers, the data in the deleted pseudo-table is the current data in the rows that would have been deleted or updated. In AFTER INSERT and INSTEAD OF INSERT triggers, using the deleted pseudo-table is allowed but returns no data.

Click through to see how they work.

Comments closed

Index Unions

Erik Darling continues a multi-state indexing spree:

Index union is a little bit different from index intersection. Rather than joining two indexes together, their result sets are concatenated together.

Just like you’d see if you wrote a query with union or union all. Crazy, huh?

As with index intersection, the optimizer has a choice between concatenation and merge join concatenation, and lookups back to the clustered index are possible.

These I see even less commonly than index intersections—so often, the optimizer decides simply to scan one index and the solution is to break the queries out into two with UNION ALL.

Comments closed