Press "Enter" to skip to content

Day: January 17, 2019

Testing Kafka Streams Applications

Yeva Byzek continues her series on testing Kafka-based streaming applications:

When you create a stream processing application with Kafka’s Streams API, you create a Topologyeither using the StreamsBuilder DSL or the low-level Processor API. Normally, the topology runs with the KafkaStreams class, which connects to a Kafka cluster and begins processing when you call start(). For testing though, connecting to a running Kafka cluster and making sure to clean up state between tests adds a lot of complexity and time.

Instead, developers can unit test their Kafka Streams applications with utilities provided by kafka-streams-test-utils. Introduced in KIP-247, this artifact was specifically created to help developers test their code, and it can be added into your continuous integration and continuous delivery (CI/CD) pipeline.

Streaming applications need tested just like any other.

Comments closed

Combining Plots In R With cowplot

Abdul Majed Raja shows how to use the cowplot library in R to merge together independent plots into a single image:

The way it works in cowplot is that, we have assign our individual ggplot-plots as an R object (which is by default of type ggplot). These objects are finally used by cowplot to produce a unified single plot.

In the below code, We will build three different histograms using the R’s in-built dataset iris and then assign one by one to an R object. Finally, we will use cowplot function plot_grid() to combine the two plots of our interest.

The only thing that disappointed me with cowplot is that its name has nothing to do with cattle.

Comments closed

Classifying Texts With Naive Bayes

I continue my series on Naive Bayes with another hand-calculation post:

Step two is, on the surface, pretty tough: how do we figure out if a set of words is a business phrase or a baseball phrase? We could try to think up a set of features. For example, how long is the phrase? How many unique words does it have? Is there a pile of sunflower seeds near the phrase? But there’s an easier way.

Remember the “naive” part of Naive Bayes: all features are independent. And in this case, we can use as features the individual words. Therefore, the probability of a word being a baseball-related word or a business-related word is what matters, and we cross-multiply those probabilities to determine if the overall phrase is a baseball phrase or a business phrase.

Click through for a sports-heavy example and a bonus Nate Barkerson reference.

Comments closed

Non-Administrative Powershell Remoting And January 2019 LCU

Emin Atac tests out a security change made in the January 2019 Latest Cumulative Update for Windows:

My first concern was: if it’s a security vulnerability, what’s its CVE? The blog post answer is: CVE-2019-0543 discovered by James Forshaw of Google Project Zero

My second concern was twofold. Is the chapter about A Least Privilege Model Implementation Using Windows PowerShell published in the PowerShell Conference Book impacted by this change? Should I stop deploying Windows 10 at work because the LCU of January 2019 breaks my loopback scenario?

The answer is no and explained by the blog post Windows Security change affecting PowerShell
you would not be affected by this change, unless you explicitly set up loopback endpoints on your machine to allow non-Administrator account access

Read on for some testing and digging into what works when and why.

Comments closed

Name Conflicts In DAX

Marco Russo takes us through an issue with naming in DAX:

You deploy this model and start creating reports using the Sales Returning Customers measure. So far, so good. One day, you need to extend the data model importing a new table that you decide to name ReturningCustomers. As soon as you import the new table named ReturningCustomers, your measure Sales Returning Customers stops working. The reason is that the ReturningCustomers variable generates a name conflict with the table that has the same name, as you can see from the error message.

‘ReturningCustomers’ is a table name and cannot be used to define a variable.

Marco has some advice if you’re in a situation where you are liable to see this pop up.

Comments closed

Java With Visual Studio Code

Niels Berglund learns about Visual Studio Code and writing Java in VS Code:

I mentioned above how Maven is a build automation tool for primarily Java projects. There are other build tools as well, but in this post, I use Maven as it is – which I mentioned above – the de-facto standard for Java-based projects.

Above we saw how we installed Maven as well as the VS Code Maven extension, however before we start to use it let us talk a little about Maven archetypes and naming conventions.

If I were in Java day in and day out, I’d probably go with IntelliJ IDEA but for occasional work, I can see VS Code doing well. That’s the interesting use case for Visual Studio Code: with all of the extensions, it’s a really good multi-purpose IDE while still being worse than the “natural” IDE for almost any single language.

Comments closed

SQL Undercover Inspector V1.3

Adrian Buckman announces a new version of the SQL Undercover team’s Inspector:

We know some of you really hate linked servers so we have been working on a powershell collection which will allow you to install the inspector without using linked servers to centrally log the information and instead the powershell function Invoke-SQLUndercoverInspector will do the rest for you (We will be writing a blog post about how you can use this soon) – this is currently a pre-release version so it’s a work in progress – I must say a massive thank you to Shane O’Neill (b | t) without his powershell skills this wouldn’t turned out as well as it has, thanks Shane!

If you’ve already downloaded this version, be aware that there is a hotfix.

Comments closed

Management Studio Query Shortcuts

Michelle Haarhues shows how you can use query shortcuts in SQL Server Management Studio:

Back in the day, with the introduction of programs like Word and Excel, I used keyboard shortcuts to make my job easier.  Then we started using a mouse and reduced the number of keyboard shortcuts I used.  It took me a long time to switch from the keyboard shortcuts to the mouse.  Now I am back to using shortcuts, especially in SQL Server Management Studio (SSMS).  Microsoft allows users to create shortcuts that, if you use them, could make your job easier.  Setting up the shortcuts in SSMS are pretty simple.

My shortcuts are all around running sp_whoisactive: Ctrl+F1 gets results back the way I want, Ctrl+3 shows only my sessions, and Ctrl+4 gives me more details (like execution plans) when I’m willing to wait the extra time to get them.

Comments closed

Review: dbForge Studio For Database Modeling

Randolph West is looking for a product for database modeling and tries out dbForge Studio:

These days I still design new databases from scratch with pen and paper (or iPad and Apple Pencil), where the entity relationship diagram (ERD) is rudimentary and crows’ feet relationships are badly-scrawled. But it got me wondering which database modelling tools are on the market today (commercial and free).

My ideal tool should be able to design a new database from scratch and generate creation scripts in T-SQL without failing over common issues like referential integrity and dependencies. More importantly though, it should be able to reverse-engineer a database (like Microsoft Visio used to be able to). This is extremely useful for consulting engagements when I need to get a picture in my head of the database I’m looking at. This was the one place I’ve used the Database Designer in SSMS more than I had initially remembered.

Randolph also mentions SQL Database Modeler, which I used on a consulting engagement where I wanted to replicate Visio’s database reverse engineer functionality.

Comments closed