Curated SQL – Page 1032 – A Fine Slice Of SQL Server

Creating a UI in PowerShell

Published 2020-01-24 by Kevin Feasel

Michael Berthold walks us through a useful example of using POSHGUI’s UI editor:

Some time back, a customer and I were working with the SentryOne PowerShell Module. Our PowerShell Module lets you manage the targets you are monitoring with SentryOne using a script or command line rather than the UI. This is a great time saver when you’re administering performance monitoring for hundreds or thousands of database servers.
The customer and I worked together to type up the commands they wanted for their script. They mentioned how it would be great if there were a GUI for this. This seemed odd initially, because the reason we were doing this in the first place was to automate these actions outside of a GUI. We spoke on it for a bit, and their meaning became clear. They envisioned a simple GUI used to guide in defining the commands for the PowerShell Module. I agreed that this would be helpful in getting a head start on scripting automation. I decided to find a way to fill this need.
This post explores one way to create a GUI using PowerShell. I’m using the SentryOne PowerShell Module for this example, but this method can be used for any PowerShell script.

Click through to see the example.

Concepts in Support Vector Machines

Published 2020-01-23 by Kevin Feasel

Abhijit Telang takes us through the calculations involved in Support Vector Machines and then gives us an example in R:

So, let’s take that out and we are back to old, classical vector algebra. It’s like a person with a bunch of sticks to figure out which one to lay where in a 2-D plane to separate one class of objects from another, provided class definitions are already known.
The problem is which particular shape and length must be chosen to show maximum contrast between classes.
We need to arrive at a function definition, in such a way that the value a given function takes changes drastically (e.g. from a large positive value to a large negative value).

SVM is often great for two-class classification problems, and different variants also work well for multi-class problems.
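
To make the "function whose value changes sign between classes" idea concrete, here is a minimal sketch in plain Python. It is not taken from Abhijit's post (which uses R); the weights `w`, bias `b`, and the sample points are hand-picked, hypothetical values rather than anything learned from data.

```python
import math

# Hypothetical weights and bias for a separating line w.x + b = 0 in 2-D.
# These are chosen by hand for illustration, not fitted by an SVM solver.
w = (1.0, 1.0)
b = -5.0

def decision(point):
    """Signed score: positive on one side of the line, negative on the other."""
    return w[0] * point[0] + w[1] * point[1] + b

def classify(point):
    """Map the signed score to a class label of +1 or -1."""
    return 1 if decision(point) >= 0 else -1

# Two small, made-up classes on either side of the line x + y = 5.
class_a = [(4.0, 4.0), (5.0, 3.0), (6.0, 6.0)]
class_b = [(1.0, 1.0), (2.0, 0.5), (0.0, 3.0)]

scores_a = [decision(p) for p in class_a]
scores_b = [decision(p) for p in class_b]

# Geometric distance from each point to the line. The smallest distance is
# the margin, which an SVM tries to maximize when it chooses w and b.
norm_w = math.hypot(w[0], w[1])
margin = min(abs(decision(p)) / norm_w for p in class_a + class_b)
```

A real SVM would search for the `w` and `b` that make `margin` as large as possible; this sketch only evaluates one candidate line.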

Log Aggregation with Apache Flink

Published 2020-01-23 by Kevin Feasel

Gyula Fora and Matyas Orhidi have started a series on log aggregation with Apache Flink:

There are several off-the-shelf solutions available on the market for log aggregation, which come with their own stack of components and operational difficulties. For example, notable logging frameworks that are widely used in the industry are the ELK stack and Graylog.
Unfortunately, there is no clear-cut solution that works for every application, and different logging solutions might be more suitable for certain use cases. For instance, the log processing of real-time applications should also happen in real time; otherwise, we lose timely information that may be required to successfully operate the system.
In this blog post, we dive deep into logging for real-time applications.

This post mostly covers background and setup, but it leads into processing and visualization.

Migrating Oracle Exadata Workloads to Azure

Published 2020-01-23 by Kevin Feasel

Kellyn Pot’vin-Gorman shows the process of moving from an Exadata system to Oracle on Azure:

An Exadata is an engineered system: database nodes, secondary cell nodes (also referred to as storage nodes/cell disks), InfiniBand for fast network connectivity between the nodes, and specialized cache, along with software features such as Real Application Clusters (RAC), hybrid columnar compression (HCC), storage indexes (indexes in memory), and offloading technology that has logic built into it to move object scans and other intensive workloads to cell nodes from the primary database nodes. There are many other features, but understanding that Exadata is an ENGINEERED system, not a hardware solution, is important, and it’s both a blessing and a curse for those databases supported by one. The database engineer must understand both Exadata architecture and software along with database administration. There is an added tier of performance knowledge, monitoring and patching that is involved, including knowledge of the Cell CLI, the command line interface for the cell nodes. I could go on for hours on more details, but let’s get down to what is required when I am working on a project to migrate an Exadata to Azure.

Click through for the process.

ORDER BY Can Change the Query Plan

Published 2020-01-23 by Kevin Feasel

Erik Darling notes that adding an ORDER BY clause to a query can change the underlying query plan:

Sometimes I think it’s interesting how adding a seemingly useless or harmless thing to a query can change the query plan.
Here’s a quick example using an Order By on an indexed column.

Your mileage may vary on whether that’s a good thing.
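
As a rough illustration of the same effect, here is a sketch that compares two plans using SQLite through Python's built-in `sqlite3` module. This is a stand-in: Erik's post is about SQL Server, whose optimizer and plan output differ, and the `posts` table and `ix_posts_score` index are hypothetical names made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, score INTEGER, title TEXT);
    CREATE INDEX ix_posts_score ON posts (score);
""")
conn.executemany(
    "INSERT INTO posts (score, title) VALUES (?, ?)",
    [(i % 50, f"post {i}") for i in range(1000)],
)

def plan(sql):
    """Return the detail text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each row.
    return [row[-1] for row in rows]

# Same columns, same table; the only difference is the ORDER BY.
plan_plain = plan("SELECT id, score, title FROM posts")
plan_ordered = plan("SELECT id, score, title FROM posts ORDER BY score")

print(plan_plain)
print(plan_ordered)
```

The exact plan text depends on the SQLite version, so no expected output is shown here. The point is only that the two plans differ even though the queries return the same rows.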

Power BI: Visual has Exceeded the Available Resources

Published 2020-01-23 by Kevin Feasel

Chris Webb explains why you might see an error in Power BI:

This visual has exceeded the available resources. Try filtering to decrease the amount of data displayed. Please try again later or contact support. If you contact support, please provide these details. More details: Resource Governing: The query exceeded the maximum memory allowed for queries executed in the current workload group (Requested 1048580KB, Limit 1048576KB).
The official Power BI documentation has similar advice to what’s shown in this dialog about what to do here, but what’s really going on?
The information in the “More details” section of the dialog gives you a clue: in this case it’s resource governance. When you run a DAX query in Power BI it will always use a certain amount of memory; inefficient DAX calculations can cause a query to try to grab a lot of memory. In Power BI Desktop these queries may run successfully but be slow, but the Power BI Service can’t just let a query use as many resources as it wants (if it did, it may affect the performance of other queries being run by other users) so there is a resource governor that will kill queries that are too resource hungry. In the case of the visual above the query behind it tried to use more than 1GB of memory and was killed by the resource governor.

Read on to understand where these limits are and how you can modify them.
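
The two figures in the quoted error message are easier to read once converted. This small sketch only does the arithmetic on the numbers shown in the message, taking 1 GB to mean 1024 × 1024 KB; the variable names are made up for illustration.

```python
# Figures copied from the error message quoted above.
requested_kb = 1_048_580
limit_kb = 1_048_576

# 1024 KB per MB and 1024 MB per GB, so the limit works out to exactly 1 GB.
limit_gb = limit_kb / (1024 * 1024)

# How far past the limit the query went, in KB.
overage_kb = requested_kb - limit_kb
```

In other words, the limit in this message is exactly 1 GB, and the query was killed at the point where it asked for 4 KB more than that.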

Indexes for Memory-Optimized Tables

Published 2020-01-23 by Kevin Feasel

Monica Rathbun takes us through the options available when creating indexes on memory-optimized tables:

Before we dive into this subject it is VERY important to note the biggest differences.
First, ALL memory optimized indexes MUST be created when the table is created or migrated. You cannot add indexes in an existing table without dropping and recreating the table.
Secondly, currently you can only have 8 indexes per table including your primary key. Remember that every table must have a primary key to enforce a secondary copy for a minimum of schema durability. This means you can only really add 7 additional indexes, so be sure to understand your workloads and plan indexing accordingly.

There are a few other differences as well, which Monica covers before detailing the specific index options.

Solving the Gaps and Islands Set of Problems

Published 2020-01-23 by Kevin Feasel

Ed Pollack continues a series on gap and island analysis:

Gaps and islands analysis supplies a mechanism to group data organically in ways that a standard GROUP BY cannot provide. Once we know how to perform an analysis and group data into islands, we can extend this into the realm of real data.
For all code examples in this article, we will use a set of baseball data that I’ve created and maintained over the years. This data is ideal for analytics as it is large and contains data quality that varies between very accurate and very sloppy. As a result, we are forced to consider data quality in our work, as well as scrutinize boundary conditions for correctness. This data will be used without much introduction as we will only reference two tables, and each is relatively straightforward.

The code in this article gets a bit complex, but Ed shows off some powerful techniques.
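
For readers new to the idea, here is a minimal sketch of the basic islands-and-gaps grouping in plain Python. It is not Ed's code (his article uses T-SQL against his baseball data); the function names and the `game_days` list are hypothetical, made up for this example.

```python
from itertools import groupby

def find_islands(values):
    """Group integers into (start, end) runs of consecutive values."""
    ordered = sorted(set(values))
    islands = []
    # value - position stays constant within a run of consecutive values,
    # mirroring the value - ROW_NUMBER() approach often used in SQL.
    for _, run in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        members = [value for _, value in run]
        islands.append((members[0], members[-1]))
    return islands

def find_gaps(islands):
    """Return the (start, end) ranges missing between neighbouring islands."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(islands, islands[1:])
    ]

game_days = [1, 2, 3, 7, 8, 12, 13, 14, 15]  # made-up day numbers
islands = find_islands(game_days)
gaps = find_gaps(islands)
```

Here `islands` holds the runs of consecutive days and `gaps` holds the stretches between them, which is the kind of grouping a plain GROUP BY cannot express directly.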

Copying Measure Definitions in Power BI

Published 2020-01-23 by Kevin Feasel

Erik Svensen takes us through an oddity in Power BI’s user interface:

Here is an idea you can vote for if you would find it useful as well – https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/13219620-duplicate-measure-and-format-copy
So we end up copying the formula text from the formula bar,
then click New measure and paste it into the formula bar.
But 8 times out of 10, nothing is pasted (at least when I select) – WHY ???

This is a strange user experience. But regardless, I find it odd that you can’t copy a measure definition. If this is odd to you as well, upvote the Power BI suggestion.

Fraud Detection with Flink

Published 2020-01-22 by Kevin Feasel

Alexander Fedulov gives us a case study of using Apache Flink for fraud detection:

In this blog post, we have discussed the motivation behind supporting dynamic, runtime changes to a Flink application by looking at a sample use case – a Fraud Detection engine. We have described the overall architecture and interactions between its components as well as provided references for building and running a demo Fraud Detection application in a dockerized setup. We then showed the details of implementing a dynamic data partitioning pattern as the first underlying building block to enable flexible runtime configurations.
To remain focused on describing the core mechanics of the pattern, we kept the complexity of the DSL and the underlying rules engine to a minimum. Going forward, it is easy to imagine adding extensions such as allowing more sophisticated rule definitions, including filtering of certain events, logical rules chaining, and other more advanced functionality.

It was an interesting discussion and you can grab the code as well.
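
The dynamic data partitioning idea can be sketched in plain Python, assuming it means that the fields used to key events come from rule definitions supplied at runtime rather than being hard-coded. This is not Flink code and not Alexander's implementation; the rule shape, field names, and `keyed_copies` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rules: each one names the event fields to group by.
# In a running system these would arrive at runtime instead of being literals.
rules = [
    {"rule_id": 1, "grouping_keys": ["payer_id", "beneficiary_id"]},
    {"rule_id": 2, "grouping_keys": ["payment_type"]},
]

def keyed_copies(event, active_rules):
    """Yield ((rule_id, key), event) once per rule, keyed by that rule's fields."""
    for rule in active_rules:
        key = tuple(event[name] for name in rule["grouping_keys"])
        yield (rule["rule_id"], key), event

events = [
    {"payer_id": 25, "beneficiary_id": 12, "payment_type": "CSH", "amount": 23.0},
    {"payer_id": 25, "beneficiary_id": 12, "payment_type": "CRD", "amount": 40.0},
]

# Route each event copy to the partition chosen by its rule-specific key.
partitions = defaultdict(list)
for event in events:
    for partition_key, payload in keyed_copies(event, rules):
        partitions[partition_key].append(payload["amount"])
```

Changing the `rules` list changes how events are grouped without touching the routing code, which is the flexibility the pattern is after.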
