Press "Enter" to skip to content

Author: Kevin Feasel

Microservice Communication Patterns

John Hammink shares a few ways that you can have microservices communicate with one another and argues that Kafka is a great platform for microservice communication:

Simply put, microservices are a software development method where applications are structured as loosely coupled services. The services themselves are minimal atomic units which, together, comprise the entire functionality of the app. Whereas in an SOA, a single component service may combine one or several functions, a microservice within an MSA does one thing — only one thing — and does it well.

Microservices can be thought of as minimal units of functionality which can be deployed independently, are reusable, and communicate with each other via various network protocols like HTTP (more on that in a moment).

Read the whole thing. I have a love-hate relationship with these but it’s a pattern worth understanding.
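To make the Kafka option concrete, here is a minimal sketch of two services exchanging an event through a topic, using the kafka-python package (my illustration, not code from the article; the broker address, topic, and event shape are all assumptions):

import json
from kafka import KafkaProducer, KafkaConsumer

# Service A publishes an "order created" event to the orders topic.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"))
producer.send("orders", {"order_id": 1, "status": "created"})
producer.flush()

# Service B consumes from the same topic on its own schedule.
consumer = KafkaConsumer(
    "orders",
    bootstrap_servers="localhost:9092",
    group_id="billing-service",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")))
for event in consumer:
    print(event.value)  # {'order_id': 1, 'status': 'created'}
    break

The design point is that the producer never addresses the consumer directly; any number of services can subscribe to the topic later without the producer changing at all.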


Power BI IntelliSense For Python and R

David Eldersveld makes me wonder about the value of Power BI’s IntelliSense for R and Python:

If I type the letter a into the R Script editor, my code completion options are acts, always, and, and as. Power BI’s editor is not offering any IntelliSense options from a Python or R dictionary. Instead, it’s pulling from the text already in the editor. Note the comment in Line 1 and the inclusion of words beginning with the letter a — always, and, acts, as.

By comparison, the DAX editor contains a detailed function list and helpful annotations for code completion. Can we get something similar for R and Python? Not exactly… But there’s a workaround that I’m almost embarrassed to suggest. If you are a user who codes directly into the script editor, the following hack could be helpful. If you use the option to Edit script in External IDE, keep doing that and ignore the following guidance.

As-is, this is worse than no IntelliSense, because with no IntelliSense at least it’ll never steal a mouse click or keystroke. I wouldn’t expect RStudio-level quality out of the gate, but unless I’m missing something, that’s pretty bad.
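For what it’s worth, given that the completion list is built from text already in the editor, my guess at the shape of the workaround is to seed the script with a comment listing the identifiers you expect to use. A hypothetical Python example (the column name is made up; in a Python visual, Power BI supplies the data as a pandas DataFrame named dataset):

# Seed the text-based completion: every word below becomes a candidate.
# groupby describe merge pivot_table sort_values value_counts
import pandas as pd

# 'dataset' is provided by Power BI's Python script environment.
summary = dataset.groupby("category").describe()
print(summary)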


Copying Filestream Data Between Tables

Paul Randal takes us through some limitations on copying Filestream data between tables:

I was asked last week whether it’s possible to create a table with a FILESTREAM column and then populate that column by copying FILESTREAM files from another directory in the FILESTREAM data container.

The simple answer is no.

Paul explains why this isn’t possible and then gives you an alternative which does work.
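The excerpt doesn’t show the alternative, but the general principle is that SQL Server only tracks FILESTREAM files it creates itself, so the copy has to go through the engine rather than the file system. A hedged sketch of that idea via pyodbc (connection string, table, and column names are all hypothetical, and this is my illustration rather than Paul’s exact method):

import pyodbc

# Copy rows (including the FILESTREAM column) through the engine with DML;
# SQL Server writes and tracks the new files in the data container itself.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=FileStreamDemo;Trusted_Connection=yes;")
cur = conn.cursor()
cur.execute("""
    INSERT INTO dbo.DocumentArchive (DocId, DocData)
    SELECT DocId, DocData   -- DocData is the FILESTREAM varbinary(max) column
    FROM dbo.Documents
    WHERE Archived = 1;""")
conn.commit()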


Eye-Friendly Palettes

Shannon Holck has shared a Power BI theme using a color-safe, easy-to-view palette:

Edward Tufte recommended the use of soft colors that do not tire the eyes. I’ve actually never read his books (yet), but a former boss of mine was a devout disciple and produced some beautifully soft color palettes.

Stephen Few, in “Show Me the Numbers,” reiterated Tufte’s color theories and recommended three sets of hues:

Light – for large shapes, e.g. bars
Medium – for small shapes, e.g. points
Dark/Bright – for calling attention to data

Click through for more, including where you can get this Power BI theme. I’m not exactly the world’s biggest fan of the default palette, so I’ll have to check this one out.
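A Power BI theme is just a JSON document whose dataColors array replaces the default palette, so experimenting with one is cheap. A small sketch that writes such a file (the hex values are placeholders, not Shannon’s actual colors):

import json

# A minimal Power BI theme file: "dataColors" sets the default palette.
theme = {
    "name": "Soft palette (placeholder colors)",
    "dataColors": ["#8CBBD9", "#A8C69F", "#D9B48C", "#C9A0DC"],
}
with open("soft-theme.json", "w") as f:
    json.dump(theme, f, indent=2)
# Import the resulting file as a custom theme in Power BI Desktop.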


Blaming the Right Cardinality Estimator

Arthur Daniels helps us figure out which of SQL Server’s cardinality estimators your query used:

SQL Server 2008 is reaching end of support this year, so upgrading your SQL Server might be on your mind. One of the big changes when you upgrade your SQL Server is upgrading the compatibility level, which by default will upgrade the cardinality estimator (CE).

This can change query performance, for better or for worse. This post won’t focus on whether it’s good or bad, but instead I want to show you how you can check to see what CE was used by your queries.

It’s not a 100% guarantee, but generally I’ve found the new estimator to be superior.
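If you want the short version: the showplan XML records the model in a CardinalityEstimationModelVersion attribute, where 70 indicates the legacy CE and higher values (120, 130, and so on) track the new one. Here is one way to scan the plan cache for it, sketched in Python with pyodbc (not necessarily Arthur’s method; the connection string is an assumption):

import re
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=master;Trusted_Connection=yes;")
cur = conn.cursor()
# Pull cached plans with their text and showplan XML.
cur.execute("""
    SELECT TOP (50)
           st.text,
           CONVERT(nvarchar(max), qp.query_plan) AS plan_xml
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp;""")
for text, plan_xml in cur.fetchall():
    m = re.search(r'CardinalityEstimationModelVersion="(\d+)"', plan_xml or "")
    if m:
        ce = "legacy" if m.group(1) == "70" else "new"
        print(f"{ce} CE (v{m.group(1)}): {(text or '')[:60]}")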


Conditional Formatting in Power BI

Reza Rad shows us a few ways to perform conditional formatting in Power BI:

I have given many presentations and talks about data visualization, and still I am amazed by how many visualizations I see which are not following the basic rules. In this article, I want to focus on the table visual. A table is a visual that most of us use on many occasions; in fact, many users like to see the data in table format. However, a table can be visualized in a way that is not readable. In this article, I’m showing you the most common style of table which many report developers use, and then challenging it with a better style. The mystery is of course in conditional formatting. Like all my other articles, this article demonstrates the technique in Power BI. If you would like to learn more about Power BI, read the Power BI book from Rookie to Rock Star.

Some of these formats are better than others, but you do have the power to do quite a bit with it in Power BI.


Creating Multi-Column Statistics From Missing Index DMVs

Max Vernon shows how you can use the missing index DMVs to find potential candidates for multi-column statistics:

SQL Server does have a fairly useful dynamic management view, or DMV, which provides insight that can be leveraged in this area. The DMV I’m talking about is the set of DMVs around missing indexes, consisting of sys.dm_db_missing_index_groups, sys.dm_db_missing_index_details, etc. I’m not saying the missing indexes DMVs are a panacea that will enable you to fix every performance situation you run into, but they can be useful if you know where to look. This post doesn’t go into a lot of depth about how to use those DMVs for the purpose of actually creating indexes; however, I will show you how you can create multi-column stats objects as an interim performance booster while evaluating the need for those indexes.

I’ve never had great luck with multi-column stats versus simply creating indexes, but that could be a case of me doing it wrong.
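The gist of the approach, sketched in Python with pyodbc (a rough illustration of the idea, not Max’s actual script; database name is hypothetical, and you should review anything it prints before running it): the missing-index DMVs hand you equality and inequality column lists, which you can stitch into CREATE STATISTICS statements instead of full indexes.

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=YourDatabase;Trusted_Connection=yes;")
cur = conn.cursor()
# The DMV reports the table and the columns queries wished were indexed.
cur.execute("""
    SELECT d.statement, d.equality_columns, d.inequality_columns
    FROM sys.dm_db_missing_index_details AS d
    WHERE d.database_id = DB_ID();""")
for i, (table, eq_cols, ineq_cols) in enumerate(cur.fetchall()):
    # Use the equality plus inequality columns as the stats column list.
    cols = ", ".join(c for c in (eq_cols, ineq_cols) if c)
    print(f"CREATE STATISTICS stats_missing_{i} ON {table} ({cols});")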


Where Hadoop Is Going

Erik Krogen summarizes a recent Hadoop developer gathering at LinkedIn:

The day started with LinkedIn’s very own Jonathan Hung (left) and Anthony Hsu (right) discussing TensorFlow on YARN, or TonY, our home-grown and recently open-sourced solution for distributed deep learning via TensorFlow on top of YARN. They discussed its architecture and implementation, as well as future goals, such as support for additional runtimes like PyTorch. You can view their slides here and a recording of their presentation here.

Looks like there were several interesting talks and a lot of content showing where Hadoop will go over the next year or so.


Getting Started With Apache Flume

Mark Litwintschik takes us through installation and configuration of Apache Flume:

The following was run on a fresh Ubuntu 16.04.2 LTS installation. The machine I’m using has an Intel Core i5 4670K clocked at 3.4 GHz, 8 GB of RAM and 1 TB of mechanical storage capacity.

First, I’ve set up a standalone Hadoop environment following the instructions from my Hadoop 3 installation guide. Below I’ve installed Kafkacat for feeding and reading off of Kafka, libsnappy as I’ll be using Snappy compression on the Kafka topics, Python, Screen for running applications in the background, and ZooKeeper, which is used by Kafka for coordination.

From there, Mark has the configuration scripts and processes to get the entire pipeline built.
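To give a flavor of the Flume side, an agent is declared in a properties file that wires sources, channels, and sinks together. A generic sketch (not Mark’s actual configuration) that listens on a local TCP port and writes events to a Kafka topic:

# flume.conf: one agent with a source, a channel, and a sink.
agent.sources = netcat-src
agent.channels = mem-ch
agent.sinks = kafka-sink
# Source: accept lines of text on a local port.
agent.sources.netcat-src.type = netcat
agent.sources.netcat-src.bind = 127.0.0.1
agent.sources.netcat-src.port = 44444
agent.sources.netcat-src.channels = mem-ch
# Channel: buffer events in memory between source and sink.
agent.channels.mem-ch.type = memory
agent.channels.mem-ch.capacity = 10000
# Sink: publish each event to a Kafka topic.
agent.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.kafka-sink.kafka.bootstrap.servers = 127.0.0.1:9092
agent.sinks.kafka-sink.kafka.topic = flume-demo
agent.sinks.kafka-sink.channel = mem-ch

You’d start something like this with flume-ng agent -n agent -f flume.conf; Mark’s post covers the real pipeline end to end.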


Parsing HL7 Messages With Python

Cristian Satnic has HL7-formatted messages in SQL Server and wishes to parse them using Python:

Each line in the HL7 message is called a segment, and then each segment is split into individual fields by | (pipe) characters (typically). HL7 fields have well-defined names and meanings … for example, in the example above, PID-3 (the 3rd field in the PID segment, where the identifier ‘PID’ is not counted) is 12001, and that represents the patient identifier.

For this particular project I’m working on we have HL7 messages stored in a SQL Server 2016 database table where each row in the table contains the raw HL7 2.x message in a particular column. I need to be able to intelligently filter over this HL7 data by looking at values in particular HL7 fields (as shown above). Since this HL7 data is stored in a varchar(MAX) column I could certainly attempt to play games using LIKE comparisons in SQL but that would not get me very far. SQL simply does not understand the complex structure of HL7 and I have no native SQL Server functions at my disposal that I could quickly use to parse this data and filter it.

Cristian has a Jupyter Notebook which takes us through the solution. With SQL Server 2017, there’s the possibility of solving this in a stored procedure using Machine Learning Services.
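To make the field-addressing scheme concrete, here’s a standalone sketch of the parsing logic described above (my code, not Cristian’s notebook; the sample message is fabricated for illustration):

# Segments are separated by carriage returns, fields by pipes, and the
# segment ID itself doesn't count as a field, so PID-3 is index 3.
message = ("MSH|^~\\&|SENDER|FAC|RECEIVER|FAC|202001010000||ADT^A01|MSG001|P|2.3\r"
           "PID|1||12001||DOE^JOHN")

def hl7_field(message: str, segment_id: str, field_no: int) -> str:
    """Return SEG-<field_no>, e.g. PID-3, from a raw HL7 2.x message."""
    for segment in message.split("\r"):
        fields = segment.split("|")
        if fields[0] == segment_id:
            return fields[field_no]  # fields[0] is the segment ID itself
    raise KeyError(f"segment {segment_id} not found")

print(hl7_field(message, "PID", 3))  # -> 12001

A real implementation has more edge cases (MSH counts its own field separator as MSH-1, for instance), which is where a library like python-hl7 earns its keep.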
