Press "Enter" to skip to content

Day: August 1, 2018

Joining Streams Of Data

Chuck Blake gives an example of joining two streams of data together in Wallaroo:

The joining event streams pattern takes multiple data pipelines and joins them to produce a new signal message that can be acted upon by a later process.

This pattern can is used in a variety of use cases. Here are a few examples:

  • Merging data for an individual across a variety of social media accounts.

  • Merging click data from a variety of devices (e.g. mobile and desktop) for an individual user.

  • Tracking locations of delivery vehicles and assets that need to be delivered.

  • Monitoring electronic trading activity for clients on a variety of trading venues.

Conceptually, it’s very similar to normal join operations, but there is a time element which complicates things.

Comments closed

Confluent Platform 5.0 Released

Raj Jain and Michael Noll walk through the latest version of Confluent Platform, Confluent’s Kafka solution:

With Confluent Platform 5.0, operators can secure infrastructure using the new, easy-to-use LDAP authorizer plugin and can deliver faster disaster recovery (DR) thanks to automatic offset translation in Confluent Replicator. In Confluent Control Center, operators can now view broker configurations and inspect consumer lag to ensure that they are getting the most out of Kafka and that applications are performing as expected.

We have also introduced advanced capabilities for developers. In Confluent Control Center, developers can now better understand the data in Kafka topics due to the new topic inspection feature and Confluent Schema Registry integration. Control Center presents a new graphical user interface (GUI) for writing KSQL, making stream processing more effortless and intuitive as well. The latest version of KSQL itself introduces exciting additions, such as support for nested data, user-defined functions (UDFs), new types of joins and an enhanced REST API. Furthermore, Confluent Platform 5.0 includes the new Confluent MQTT Proxy for easier Internet of Things (IoT) integration with Kafka. The latest release is built on Apache Kafka 2.0, which features several new functionalities and performance improvements.

Looks like there have been some nice incremental improvements here.

Comments closed

Testing BI Projects With NBi: Hierarchies And Levels

Cedric Charlier shows us a potential pain point when testing an Analysis Services cube with hierarchies defined:

When executing the following query (on the Adventure Works 2012 sample database/cube), you’ll see two columns in the result displayed by SSMS. It’s probably what you’re expecting, you’re only selecting one specific level of the hierarchy [Date].[Calendar Date] and one measure.

You’re probably expecting that NBi will also consider two columns. Unfortunately, it’s not the case: NBi will consider 4 columns! What are the additional and unexpected columns? The [Date].[Calendar].[Calendar Year] and [Date].[Calendar].[Calendar Semester] are also returned. In reality, this is not something specific to NBi, it’s just the standard behaviour of the ADOMD library and SSMS is cheating when only displaying one column for the level!

Click through for the solution.  And if NBi sounds interesting, check out Cedric’s prior post on the topic.

Comments closed

Configuring SQL Operations Studio

Kendra Little gave a quiz on SQL Operations Studio and is now sharing the answers:

Question 2: To toggle a BLOCK comment, the built-in shortcut is…

Answer: Shift+Alt+A

  • Correct: 43 (27%)
  • Incorrect: 118 (73%)

I think a lot of folks who use SSMS regularly and don’t use VSCode may not know what I meant by the question, because SSMS doesn’t have this functionality (or if it does, I’ve never figured out the shortcut!)

Check out all of the answers and build up those SQLOps skills.

Comments closed

Betteridge’s Law And Index Hints

Bert Wagner asks a question in his title, Should You Use Index Hints?  Those familiar with Betteridge’s Law of Headlines know the general answer already:

One way to “fix” a poor performing plan is to use an index hint.  While we normally have no control over how SQL Server retrieves the data we requested, an index hint forces the  query optimizer to use the index specified in the hint to retrieve the data (hence, it’s really more of a “command” than a “hint”).

Sometimes when I feel like I’m losing control I like using an index hint to show SQL Server who’s boss.  I occasionally will also use index hints when debugging poor performing queries because it allows me to confirm whether using an alternate index would improve performance without having to overhaul my code or change any other settings.

About the only place I consistently use index hints is with filtered indexes, where the combination of parameter sniffing and inexactitude in filters will convince the optimizer that the filtered index isn’t helpful when it really is.

Comments closed

Database Source Control With SVN

Nate Johnson sets up SVN for local source control:

I almost always have trouble remembering which option is for use with a non-empty folder of “here’s a bunch of files that I want to dump into the repo to start with”, vs. “here’s an empty folder where I want to pull down the contents of an existing repo”.  Fortunately, Tortoise yells at you if you try to do the latter — which is Export — into a non-empty folder.  So we want to Import.  Assuming you have a folder where all your SQL scripts live already, right-clicky and say “Tortoise SVN .. Import.”

Check it out.  The only concern I have is that this source control is just local source control.  That’s very helpful in situations where you accidentally mess something up, but my preference is to put my code in the same source control system the developers are using.  And if the developers aren’t using source control, get that institution in place as soon as possible because that’s begging for trouble.

Comments closed

Supertype-Subtype Relationships In Tables

Deborah Melkin explains the supertype-subtype pattern in relational database architecture:

You don’t see a supertype-subtype relationship defined as such when you’re looking at the physical database. You’ll only see it explicitly in the logical data model. So what is the pattern and how do you know that you have one in your database?

This relationship exists where you have one entity that could have different attributes based on a discriminator type. One example is a person. Depending on the role of that person in relationship to the business, you will need to store different pieces of information for them. You need different information about a client than you do an employee. But you’re dealing with a person so there is shared information.

It’s a good pattern for minimizing data repetition.

Comments closed