Press "Enter" to skip to content

Category: Misc Languages

Custom Windows in Apache Flink

Alexander Fedulov walks us through window options with Apache Flink:

In the previous articles of the series, we described how you can achieve flexible stream partitioning based on dynamically-updated configurations (a set of fraud-detection rules) and how you can utilize Flink’s Broadcast mechanism to distribute processing configuration at runtime among the relevant operators. 

Following up directly where we left the discussion of the end-to-end solution last time, in this article we will describe how you can use the “Swiss knife” of Flink – the Process Function to create an implementation that is tailor-made to match your streaming business logic requirements. Our discussion will continue in the context of the Fraud Detection engine. We will also demonstrate how you can implement your own custom replacement for time windows for cases where the out-of-the-box windowing available from the DataStream API does not satisfy your requirements. In particular, we will look at the trade-offs that you can make when designing a solution which requires low-latency reactions to individual events.

This article will describe some high-level concepts that can be applied independently, but it is recommended that you review the material in part one and part two of the series as well as checkout the code base in order to make it easier to follow along.

It’s worth giving this a careful read.

Comments closed

Creating Power BI Measures via Visual Studio Code

Phil Seamark goes one step further with TOM:

My last blog introduced the idea of using Microsoft Visual Studio Code to work with Power BI Models. For this article, I build on that idea by showing how you can use a TOM based script to automatically generate measures in your model Power BI (or Azure Analysis Services) model.

For simplicity, the example in this blog will do the following:

– Connect to an instance of Power BI Desktop
– Iterate through every Table in the model
– Iterate through every Column in the “current” table from the outer loop
– If the Column is numeric and not hidden, create a simple [Sum of <column>] measure

Read on for demonstration code and a walkthrough of the process.

Comments closed

The Basics of Gremlin

Raul Gonzalez introduces us to Gremlin:

Graph databases in Cosmos DB benefit from the same features, like the SQL API, it is globally distributed, scales independently throughput and storage, provides guaranteed latency, automatic indexing and more. So when relational databases choke with certain queries, No-SQL databases come to play.

Gremlin is the query language used by Apache Tinkerpop and it is implemented in Azure Cosmos DB. This language enables us to transverse graphs and answer complex queries that would be otherwise very expensive to run in traditional relational database engines.

Read on for a detailed example.

Comments closed

Changing Power BI Models with Visual Studio Code

Phil Seamark has a integration I hadn’t expected to see:

Visual Studio Code is a reasonably new development environment which is a lightweight and quick install and get up and running. There is nothing you can do in VS Code that you can’t also do in another tool using TOM. I just thought it would be fun to show how quick and easy it is to get up and running in very few steps.

The following exercise uses VS Code to connect and manage a Power BI Desktop model. You can also connect to models hosted in Azure Analysis Services as well as models hosted in Power BI Premium.

Read on to see how to get everything going.

Comments closed

TF-IDF using Spark .NET

Ed Elliott shows how you can use the Spark .NET library to perform machine learning in Apache Spark:

Native spark has two API’s for creating your ML applications. The historical one is Spark.MLLib and the newer API is Spark.ML. A little bit like how there was the old RDD API which the DataFrame API superseded, Spark.ML supersedes Spark.MLLib.

At the end of last year, .NET for Apache Spark had no support for either the Spark.ML or Spark.MLLib API’s so I started implementing Spark.ML. In a similar way that .NET for Apache Spark supports the DataFrame API and not the RDD API, I started with Spark.ML and I believe that having the full Spark ML API will be enough for anyone.

It’s awesome that Ed is helping to move Spark .NET forward in this way.

Comments closed

Drill-Down Tables in Cube.js

Artyom Keydunov shows off drill-down tables in Cube.js:

Since the release of drill down support in version 0.19.23, you can build interfaces to let users dive deeper into visualizations and data tables. The common use case for this feature is to let users click on a spike on the chart to find out what caused it, or to inspect a particular step of the funnel — who has converted and who has not.

In this blog post, I’ll show you how to define drill downs in the data schema and build an interface to let users explore the underlying chart’s data. If you’re just starting with Cube.js, I highly recommend beginning with this Cube.js 101 tutorial and then coming back here. Also, if you have any questions, don’t hesitate to ask them in our Slack community.

Click through for the demo, as well as links to the source code and an online example.

Comments closed

Traits in Scala

Sanjana Aggarwal explains the notion of Traits in Scala:

Traits are a fundamental unit of code reuse in Scala. Trait encapsulates method and field definitions, which can be reused by mixing into classes.

Two most important concept about Traits are :-

– Widening from thin interface to rich interface
– Defining stackable modifications.

There are some rather powerful things you can do with traits in Scala, though it’s important to be careful not to overdo it.

Comments closed

The Either Monad in Scala

Jyoti Sachdeva explains some of the power of Either:

We use Options in scala but why do we want to go for Either?

Either is a better approach in the respect that if something fails we can track down the reason, which in Option None case is not possible.
We simply pass None but what is the reason we got None instead of Some. We will see how to tackle this scenario using Either.

This is a classic functional programming pattern and one of the easier monads to understand.

Comments closed

Tips for Debugging in Visual Studio

Patrick Smacchia has 12 tips for debugging in Visual Studio:

4) Data breakpoint: Break when value changes

If you set a breakpoint to a non-static property setter it will be hit when changing the property value for all objects. The same behavior can be obtained for a single object thanks to the Locals (or Watch) window right click : Break When Value Changes menu.

This facility is illustrated with the animation above. The hit occurs only when obj2.Prop is changed, not when obj1.Prop is changed.

These go a step beyond the basics, so check them out.

Comments closed

Pattern Matching in Scala

Mansi Babbar covers one of the most powerful tools available in functional programming languages like Scala:

The match expression consist of multiple parts:

1. The value we’ll use to match the patterns is called a candidate 
2. The keyword match
3. At least one case clause consisting of the case keyword, the pattern, an arrow symbol, and the code to be executed when the pattern matches
4. A default clause when no other pattern has matched. The default clause is recognizable because it consists of the underscore character (_) and is the last of the case clauses

This is similar to the switch statement in C-like languages but offers up a few more things like partial matching of complex objects. Mansi covers some of the ways in which these two things differ.

Comments closed