Category: Misc Languages

An Introduction To Splunk

Published 2018-02-14 by Kevin Feasel

Victoria Holt has some basics on Splunk:

Splunk, a software platform, has the capability to leverage machine data for data management and analytics. It can be used for

Data driven decision making

Alerts for network security threats

Report on system failures

Analyse and improve functionality

It enables performance analysis, dashboard creation, monitoring, troubleshooting and investigation of the real-time data collected. A Edureka learning video showed the Splunk components.

Advanced Splunk queries are still a bit like magic to me, but this is a very powerful service once you get a handle on how it works.

Comments closed

Analyzing Spatial Data With Cosmos DB

Published 2018-02-09 by Kevin Feasel

Ben Jarvis shows how to query spatial data from Cosmos DB:

The above code connects to Cosmos DB and retrieves the details for the base airfield that was specified, it then calculates the range of the aircraft in meters by multiplying the endurance (in hours) by the true airspeed in knots (nautical miles per hour) and then multiplying that my 1852 (number of meters in a nautical mile). A Linq query is then run against Cosmos DB using the built-in spatial functions to find airfields within the specified distance. The result is then converted into a JSON array that can be understood by the Google Maps API that is being used on the client side.

The client side uses the Google Maps API to plot the airfields on a map, giving us a view like the one below when given a base airfield of Blackbushe (EGLK), a true airspeed of 100kts and an endurance of 4.5 hours

Click through for .NET code to load and analyze the data.

Comments closed

Bash For The Powershell-Minded

Published 2018-02-05 by Kevin Feasel

Mark Wilkinson has started a new series on Bash. His first post is an introduction to the scripting language:

Bash (the Bourne Again Shell) was created in 1989 for the GNU Project as a free replacement for the Unix Bourne shell. Most modern Linux systems use Bash as their default command line shell, so if you have ever dropped to a command line on a Linux system, you have probably used Bash. Just like PowerShell, Bash is both a scripting language and a command shell/interpreter. So not only can you execute commands in an interactive shell session, but you can also write scripts that incorporate multiple commands.

Once you get your hands dirty with Bash you’ll notice a lot of features that were incorporated into PowerShell. Things like command substitution: $(Get-Date) were directly pulled from Bash $(date). Other features will look familiar as well, like the ability to pipe multiple commands together.

One thing you need to understand right away is that Bash is string based, not object based like PowerShell. This means you’ll find yourself doing a lot more string processing to get tasks done. Things like string splitting will be much more common. Bash does support objects, like arrays, but few if any commands output an array. As we go through this series you’ll see that this might not be as limiting as it sounds.

The best part about learning Bash is that you can then get into arguments about Bash vs ksh vs zsh.

Comments closed

A Definition Of Functional Programming

Published 2018-01-26 by Kevin Feasel

Kevin Sookocheff contrasts functional programming with its imperative cousin:

Functional programming is a form of declarative programming that expresses a computation directly as pure functional transformation of data. A functional program can be viewed as a declarative program where computations are specified as pure functions.

I think that if you’re a set-based SQL developer, functional programming languages will make the most intuitive sense. They’re a bit harder to wrap your mind around if you’ve grown up as an imperative C-style developer, but are still worth the effort.

Comments closed

AWS Glue Now Supports Scala

Published 2018-01-19 by Kevin Feasel

Mehul Shah, et al, announce that AWS Glue officially supports Scala:

We are excited to announce AWS Glue support for running ETL (extract, transform, and load) scripts in Scala. Scala lovers can rejoice because they now have one more powerful tool in their arsenal. Scala is the native language for Apache Spark, the underlying engine that AWS Glue offers for performing data transformations.

Beyond its elegant language features, writing Scala scripts for AWS Glue has two main advantages over writing scripts in Python. First, Scala is faster for custom transformations that do a lot of heavy lifting because there is no need to shovel data between Python and Apache Spark’s Scala runtime (that is, the Java virtual machine, or JVM). You can build your own transformations or invoke functions in third-party libraries. Second, it’s simpler to call functions in external Java class libraries from Scala because Scala is designed to be Java-compatible. It compiles to the same bytecode, and its data structures don’t need to be converted.

To illustrate these benefits, we walk through an example that analyzes a recent sample of the GitHub public timeline available from the GitHub archive. This site is an archive of public requests to the GitHub service, recording more than 35 event types ranging from commits and forks to issues and comments.

Functional languages tend to be very good for ETL tasks, and Scala is a great choice due to its relationship with Spark.

Comments closed

Breeze: Mathematics In Scala

Published 2017-12-28 by Kevin Feasel

Nitin Aggarwal introduces the mathematics library behind Spark’s machine learning library, MLlib:

In simple terms, Breeze is a Scala library that extends the Scala collection library to provide support for vectors and matrices in addition to providing a whole bunch of functions that support their manipulation. We could safely compare Breeze to NumPy in Python terms. Breeze forms the foundation of MLlib—the Machine Learning library in Spark

Breeze comprises four libraries:

breeze-math: Numerics and Linear Algebra. Fast linear algebra backed by native libraries (via JBlas) where appropriate.
breeze-process: Tools for tokenizing, processing, and massaging data, especially textual data. Includes stemmers, tokenizers, and stop word filtering, among other features.
breeze-learn: Optimization and Machine Learning. Contains state-of-the-art routines for convex optimization, sampling distributions, several classifiers, and DSLs for Linear Programming and Belief Propagation.
breeze-viz: (Very alpha) Basic support for plotting, using JFreeChart.

Read on for samples and basic usage.

Comments closed

The Triumph Of Functional Programming

Published 2017-12-20 by Kevin Feasel

Amanda LeClair and Michael Facemire have a new report on functional programming:

The customer-facing software development world is outgrowing stateful, object-oriented (OO) development. The bar for great, intuitive customer experience has been raised by ambient, conversation-driven user interfaces, like through Amazon Alexa. Functional programming allows enterprises to take better advantage of compute power to deliver those experiences at scale; better flexibility for delivering the right output; and a more efficient way of delivering customer value. FP also reduces regression defects in code, simplifies code creation and maintenance, and allows for greater code reuse.

Just as object-oriented programming (OOP) emerged as the solution to the limitations of procedural programming at the dawn of the internet boom in the mid-’90s, FP is emerging as the solution to the limitations of OOP today. The shift is already underway– 53% of global developers reported that at least some teams in their companies are practicing functional programming and are planning to expand their usage.

Alexey Sommer notes that functional programming has been sneaking into C# bit by bit for well over a decade:

Retrospective

C# 1.0 Visual Studio 2002

C# 1.1 Visual Studio 2003 – #line, pragma, xml doc comments

C# 2.0 Visual Studio 2005 – Generics, Anonymous methods, iterators/yield, static classes

C# 3.0 Visual Studio 2008 – LINQ, Lambda Expressions, Implicit typing, Extension methods

C# 4.0 Visual Studio 2010 – dynamic, Optional parameters and named arguments

C# 5.0 Visual Studio 2012 – async/await, Caller Information, some breaking changes

C# 6.0 Visual Studio 2015 – Null-conditional operators, String Interpolation

C# 7.0 Visual Studio 2017 – Tuples, Pattern matching, Local functions

I strongly believe that if you are a database developer and need to pick up a non-SQL programming language, functional languages will be a lot easier for you to get than object-oriented languages. Many of the principles line up much smoother with functional languages, as you can most clearly see with the relationship between Scala and Spark.

Comments closed

Pipes And More Pipes In R

Published 2017-12-19 by Kevin Feasel

Gabriel (de Selding?) has a tutorial on how to use the various pipes in R:

In F#, the pipe-forward operator |> is syntactic sugar for chained method calls. Or, stated more simply, it lets you pass an intermediate result onto the next function.

Remember that “chaining” means that you invoke multiple method calls. As each method returns an object, you can actually allow the calls to be chained together in a single statement, without needing variables to store the intermediate results.

In R, the pipe operator is, as you have already seen, %>%. If you’re not familiar with F#, you can think of this operator as being similar to the +in a ggplot2 statement. Its function is very similar to that one that you have seen of the F# operator: it takes the output of one statement and makes it the input of the next statement. When describing it, you can think of it as a “THEN”.

Auto-recommended for the F# love, and a good tutorial to boot.

John Mount has a few interesting notes on the topic:

data.table has essentially used the square bracket sequence “][” in a manner equivalent to piping in R since about 2006. Here is an example.
The Bizarro Pipe “->.;” has always been possible in R, though I worked it out and started teaching it in 2016. And it is as quick to type as any other notation once you bind it to a keyboard shortcut.

Read on for the rest of his notes, too.

Comments closed

Error Handling In Scala

Published 2017-12-15 by Kevin Feasel

Manish Mishra gives a few examples of how to handle errors in Scala:

Try[T] is another construct to capture the success or a failure scenarios. It returns a value in both cases. Put any expression in Try and it will return Success[T] if the expression is successfully evaluated and will return Failure[T] in the other case meaning you are allowed to return the exception as a value. However with one restriction that it in case of failures it will only return Throwable types:
def validateZipCode(zipCode:String): Try[Int] = Try(zipCode.toInt)
But Throwing an exception doesn’t make much sense here since it is not much of a calculation. Although we can take this example to understand the use case. If the given string is not a number, it will be a failure. The value from the Try can be extracted in same as Option. It can be matched

As you write more complicated Spark operations, handling errors becomes critical.

Comments closed

Azure Functions Basics

Published 2017-11-28 by Kevin Feasel

Vincent-Philippe Lauzon explains the basics of Azure Functions:

In general, serverless refers to an economical model where we pay for compute resources used as opposed to “servers”.

Wait… isn’t that what the Cloud is about?

Well, yes, on a macro-scale it is, but serverless brings it to a micro-scale.

In the cloud we can provision a VM, for example, run it for 3 hours and pay for 3 hours. But we can’t pay for 5 seconds of compute on a VM because it won’t have time to boot.

A lot of compute services have a “server-full” model. In Azure, for instance, a Web App comes in number of instances. Each instance has a VM associated to it. We do not manage that VM but we pay for its compute regardless of the number of requests it processes.

In a serverless model, we pay for micro-transactions.

This is the first part in a series and is aimed at giving a conceptual explanation.

Comments closed