Category: Misc Languages

Apache Spark Performance Tuning

Tomaz Kastrun provides a few hints when performance tuning Apache Spark code:

DataFrame versus Datasets versus SQL versus RDD is another choice, yet it is fairly easy. DataFrames, Datasets and SQL objects are all equal in performance and stability (at least from Spark 2.3 and above), meaning that if you are using DataFrames in any language, performance will be the same. Again, when writing custom objects or functions (UDF), there will be some performance degradation with both R and Python, so switching to Scala or Java might be an optimisation.

Read on for the details. My version is “When performance matters the most, be willing to switch to Scala.” It’s not always correct, but is rarely outright bad advice.

Using Scala in a Databricks Notebook

Tomaz Kastrun takes a look at the original Spark language:

Let us start with Databricks datasets, which are available within every workspace and are here mainly for test purposes. This is nothing new; both Python and R come with sample datasets, for example the Iris dataset that is available with the base R engine and the Seaborn Python package. The same goes for Databricks, and sample datasets can be found in the /databricks-datasets folder.

Click through for the walkthrough and introduction to Scala as it relates to Apache Spark.

Creating an app with Suave and F#

Diogo Souza shows off the Suave framework:

F# is the go-to language if you’re seeking functional programming within the .NET world. It is multi-paradigm, flexible, and provides smooth interoperability with C#, which brings even more power to your development stack, but did you know that you can build APIs with F#? Not common, I know, but it’s possible due to the existence of frameworks like Suave.io.

Suave is a lightweight, non-blocking web server. Since it is non-blocking, it means you can create scalable applications that perform way faster than the ordinary APIs. The whole framework was built as a non-blocking organism.

I will shout from the rooftops that data platform developers should learn functional programming. In the .NET space, that’s F#.
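
To make the "non-blocking" idea in the quote a bit more concrete, here is a minimal sketch. It uses Python's asyncio rather than Suave or F#, so treat it as an analogy and not as how Suave works internally; the handle and serve_all names are made up for illustration. Each handler yields control while it waits, so all three requests start before any of them finishes.

```python
import asyncio


async def handle(request_id: int, log: list) -> str:
    """Hypothetical request handler that waits without blocking the event loop."""
    log.append(f"start {request_id}")
    # Stand-in for waiting on a database or network call.
    await asyncio.sleep(0.05)
    log.append(f"end {request_id}")
    return f"response {request_id}"


async def serve_all(log: list) -> list:
    # gather schedules every handler and returns results in request order.
    return await asyncio.gather(*(handle(i, log) for i in range(3)))


log = []
responses = asyncio.run(serve_all(log))
```

A blocking version would log start and end for request 0 before request 1 even began; here the three start entries come first, which is the property that lets one thread juggle many waiting requests.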

An Intro to Time Series Databases

Kyle Buzzell looks at time series databases:

As the name implies, a time series database (TSDB) makes it possible to efficiently and continuously add, process, and track massive quantities of real-time data with lightning speed and precision. While other database models have been used for these kinds of workloads in the past, TSDBs utilize specific algorithms and architecture to deal with their unique needs.

In this piece, we’ll take a deeper look at time series databases, including the unique needs of the workloads they’re built for, their benefits, common use cases, and the TSDBs out there.

Click through for an overview. Time series databases are definitely a niche product, but they are really good inside that niche.

Stored Procedure Return Values and Entity Framework Core

Erik Ejlskov Jensen shows us how to retrieve the return value from a stored procedure using Entity Framework Core:

SQL Server stored procedures can return data in three different ways: Via result sets, OUTPUT parameters and RETURN values – see the docs here.

I have previously blogged about getting result sets with FromSqlRaw here and here.

I have blogged about using OUTPUT parameters with FromSqlRaw here.

In this post, let’s have a look at using RETURN values.

Click through for the process.

Setting the Default Command Timeout with Microsoft.Data.SqlClient

Erik Ejlskov Jensen shows us a way to set a default command timeout in .NET’s Microsoft.Data.SqlClient:

With the latest 2.1.0 preview 2 release of the open source .NET client driver for Microsoft SQL Server and Azure SQL Database, Microsoft.Data.SqlClient, it is now possible to set the default command timeout via the connection string.

Now you can work around timeout issues simply by changing the connection string, where this previously required changes to code, and maybe changes to code you did not have the ability to change.

This is pretty nice, as my recollection was that you could set connection timeout via connection string, but not command timeout. And not everything’s going to wrap up nicely within 30 seconds.

Creating Power BI External Tools in VS Code

Phil Seamark takes us through creating external tools in Power BI:

For this article, I want to share a way for you to create your own Power BI “Helper Tool” and register it as an external tool in Power BI. This article carries on from some of my recent articles on how you can use Visual Studio Code to help automate specific tasks by taking advantage of the existing Analysis Services client libraries.

In my role, I often connect to AS models (Power BI or Azure AS) and often want to perform specific tasks quickly. The helper tool I share here allows you to connect easily to an AS model and then perform helpful tasks. I’ve deliberately kept the look and feel of the tool to be ‘old school’ like me. 

Click through for the step-by-step instructions.

Records in C# 9

Patrick Smacchia walks us through record types in C# 9:

The second core property of string and record value-based semantics is immutability. Basically, an object is immutable if its state cannot change once the object has been created. Consequently, a class is immutable if it is declared in such a way that all its instances are immutable.

I remember a discussion with a developer who got nervous about immutability. It looked like an unnatural constraint to him: he wanted his object’s state to change. But he didn’t realize that something he used every day – string operations – relied on immutability. When you modify a string, a new string object actually gets created. Records behave the same way. Moreover, a clean new syntax based on the keyword with was introduced in C# 9.

They aren’t as fancy as F# record types, but it is fun to watch C# move slowly to being a functional-friendlier language—something which has been the case since Don Syme helped implement generics in C#.
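
For readers who want to see the pattern without a C# 9 compiler handy, here is a rough analogue in Python rather than C#. A frozen dataclass gives value-based equality and immutability, and dataclasses.replace plays roughly the role of the with keyword by copying the object with some fields changed. The Person type and its fields are hypothetical, and this is only a sketch of the idea, not a statement about how C# records are implemented.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Person:
    """Hypothetical immutable record with value-based equality."""
    first_name: str
    last_name: str


original = Person("Ada", "Lovelace")

# Non-destructive update: builds a new object and leaves the original alone.
renamed = replace(original, last_name="King")

# Direct mutation of a frozen instance is rejected.
try:
    original.last_name = "Byron"
    mutated = True
except FrozenInstanceError:
    mutated = False

# Two separately built instances with the same field values compare equal.
same_value = Person("Ada", "Lovelace") == original
```

Just as with strings, "changing" the record means producing a new value, so anything still holding the original sees it unchanged.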
