Misc Languages – Page 23

Spark+AI Summit 2019 Announcements

Published 2019-05-01 by Kevin Feasel

Victoria Holt creates a roundup of Spark+AI Summit 2019 announcements:

Rohan Kumar of Microsoft announced .NET for Apache Spark, making Apache Spark accessible to .NET developers – Git Hub

I’m very happy that the Spark for .NET team added F# support. Spark is so much nicer when using functional programming.

Comments closed

Azure Cloud Shell

Published 2019-04-30 by Kevin Feasel

Mark Broadbent gives us an introduction to Azure Cloud Shell:

There are two ways to access Azure Cloud Shell, the first being directly through the Azure Portal itself. Once authenticated, look to the top right of the Portal and you should see a grouping of icons and in particular, one that looks very much like a DOS prompt (have no fear, DOS is nowhere to be seen).
The second method to access Azure Cloud Shell is by jumping directly to it via shell.azure.com which will require you to authenticate to your subscription before launching. There is an ever so slight difference between each method. Accessing the Shell via the Azure Portal will not require you to specify your Azure directory context (assuming you have several) since your Portal will have already defaulted to one, whereas with the direct URL method that obviously doesn’t happen.

Read the whole thing.

Comments closed

Monads and Monoids and Functors

Published 2019-04-25 by Kevin Feasel

Anmol Sarna explains the concept of a monad:

In functional programming, a monad is a design pattern that allows structuring programs generically while automating away boilerplate code needed by the program logic.
To simplify the above definition a bit more, We can think of monads as wrappers. You just take an object and wrap it with a monad.
Let’s just be clear on one thing: A Monad is not a class or a trait; Neither is it only dedicated to the Scala language. It is a concept related to functional programming.

This also includes a few examples in Scala.

Comments closed

Reviewing the Stack Overflow Developer Survey

Published 2019-04-16 by Kevin Feasel

Michael Toth looks at the recently-released 2019 Stack Overflow Developer Survey:

Since 2011, Stack Overflow has been surveying their users each year to answer questions about the technologies they use, their work experience, their compensation, and their satisfaction at work. Given Stack Overflow’s place in the broader programming world, they are able to draw quite the audience for their annual surveys.
This year, nearly 90,000 developers participated in the survey! There’s a lot in this survey, and I recommend reviewing it yourself, but I wanted to surface some of the key findings that I thought were particularly relevant to data professionals here.
Stack Overflow says they will be releasing the underlying data for this survey in the coming weeks, so I hope to return to this for a deeper analysis once that’s made available. For now, let’s get into the results!

Michael’s lede involves R versus Python in terms of salaries, but for me, the top line is that functional programmers make more money. Clojure, F#, Scala, Elixir, and Erlang make the top 10 on the global list, including positions 1, 2, 4, and 5. Within the US, Scala, Clojure, Erlang, Kotlin, F#, and Elixir make the top 10, including positions 1, 2, and 4. H/T R-Bloggers

Comments closed

Understanding DNS for Developers

Published 2019-04-16 by Kevin Feasel

RJ Zaworski explains DNS for web developers:

DNS can use a similar TCP/IP stack, but being parts of a simple system, most DNS operations can also travel the wire on the Internet’s favorite Roulette wheel: the User Datagram Protocol, UDP.
On a good day, UDP is fast, simple, and stripped bare of unnecessary niceties like delivery guarantees and congestion management. But a UDP message may also never be delivered, or it may be delivered twice. It may never get a response, which makes for fun client design–particularly coming from the relatively safe and well-adjusted world of HTTP. With TCP, you get an established connection and all kinds of accommodations when Things Inevitably Go Wrong. UDP? “Best effort” delivery. Which means a packet thrown over the fence with a prayer for a soft landing.

It’s a good read if you’re new to DNS.

Comments closed

Using the Cosmos DB Change Feed

Published 2019-04-04 by Kevin Feasel

Hasan Savran (who just became a Microsoft MVP, so congrats to him) takes us through the Cosmos DB Change Feed:

Azure Cosmos DB Change Feed exposes Cosmos DB Logs to outside of CosmosDB. CosmosDB notifies you immediately when there is any change in your database. It supports all Inserts and Updates, Delete will be available soon. You can always use soft delete to catch delete events if you need to.

By knowing what is changed in your database, you can trigger all kind of events and you can make your application work very smart. SQL Server has similar functionality but like many other features Log shipping is usually blocked by DBAs or the company policies. In CosmosDB, you don’t need to do anything to enable Change Feed feature! It’s already enabled, all you need to do is to configure it. Easiest way to catch change feed events is Azure Functions.

When I hear someone describe the change feed, I immediately imagine it as a Kafka topic.

Comments closed

Bring .NET Support to Spark

Published 2019-03-28 by Kevin Feasel

I have a request that you vote up a Spark issue:

There is a Jira ticket for the Apache Spark project, SPARK-27006. The gist of this ticket is to bring .NET support to Spark, specifically by supporting DataFrames in C# (and hopefully F#). No support for Datasets or RDDs is included in here, but giving .NET developers DataFrame access would make it easy for us to write code which interacts with Spark SQL and a good chunk of the SparkSession object.

You an click through and read everything I have to say, but do go to the Spark ticket and vote for .NET support.

Comments closed

Working with Columns in Spark

Published 2019-03-26 by Kevin Feasel

Achilleus has a two-parter on working with columns in Spark. Part 1 covers some of the basic syntax and several functions:

Also, we can have typed columns which is basically a column with an expression encoder specified for the expected input and return type.
scala> val name = $"name".as[String] name: org.apache.spark.sql.TypedColumn[Any,String] = name scala> val name = $"name" name: org.apache.spark.sql.ColumnName = name
There are more than 50 methods(67 the last time I counted ) that can be used for transformations on the column object. We will be covering some of the important methods that are generally used.

Part 2 covers other functions including window functions:

17) over
This is one of the most important function that is used in many of the window operations.We can talk about the window function in detail when discuss about aggregation in spark but for now, it will be fair enough to say that over method provides a way to apply an aggregation over a window specification which in turn can be used to specify partition, order and frame boundaries of the aggregation.

Check out both of these posts for useful tidbits.

Comments closed

Selecting a List of Columns from Spark

Published 2019-02-26 by Kevin Feasel

Unmesha SreeVeni shows us how we can create a list of column names in Scala to pass into a Spark DataFrame’s select function:

Now our example dataframe is ready.
Create a List[String] with column names.
scala> var selectExpr : List[String] = List("Type","Item","Price") selectExpr: List[String] = List(Type, Item, Price)

Now our list of column names is also created.
Lets select these columns from our dataframe.
Use .head and .tail to select the whole values mentioned in the List()

Click through for a demo.

Comments closed

Case Classes In Scala

Published 2019-02-14 by Kevin Feasel

Shubham Dangare explains what case classes are in Scala:

Case class is scale way to allow pattern matching on an object without requiring a large amount of boilerplate. All you need to do is add a single case keyword modifier to each class that you want to pattern matching using such modifier makes scala compiler add some syntactic conveniences to your class and compiler add companion object(with the apply method)
Adds factory method with the name of the class this means that for instance, you can write StringValue(“X”) to construct a StringValue object instead of using new StringValue(“X”)

Given how useful case classes are in Spark, it’s good to know how they operate. For more background on the topic, Alessandro Lacava has a post from a few years back describing the topic well.

Comments closed

Category: Misc Languages