Press "Enter" to skip to content

Category: Misc Languages

Adding IN Search Functionality To .NET

Jay Robinson shows off a few extension methods he creates to make dealing with C# easier:

Then I could use the extension like this:

if (mySeries.In(Enum.Series.ProMazda, Enum.Series.Usf2000)) myChassis = "Tatuus";

As for the other two methods, well… When is a null not a null? When it’s a System.DBNull.Value, of course! SQL Server pros who have spent any time in the .NET Framework will recognize this awkwardness:

var p = new System.Data.SqlClient.SqlParameter("@myParam", System.Data.SqlDbType.Int);
p.Value = (object)myVar ?? System.DBNull.Value;

With the extension, the second line becomes:

p.Value = mVar.ToDbNull();

I like it that Jay ended up going with a different language than T-SQL.  It’s no F#, but it’ll do.

Comments closed

Extractors In Scala

Jyoti Sachdeva explains what extractors are in Scala and why they’re useful:

An extractor is an object that has an unapply method. It takes an object as an input and gives back arguments. Custom extractors are created using the unapply method. The unapply method is called extractor because it takes an element of the same set and extracts some of its parts,  apply method also called injection acts as a constructor, takes some arguments and yields an element of a given set.

Click through for explanatory examples.

Comments closed

Machine Learning With F#

Diogo Souza gives us an introduction to using Accord.NET in F#:

F# is a scripting as well as a REPL language. REPL comes from Read-Eval-Print Loop, which means that the language processes single steps one at a time like reading the user inputs (usually expressions), evaluating their values and, in the end, returning the result to the same user. All that happens in a loop until the loop ends. Visual Studio provides a great F# Interactive view that runs the scripts in REPL mode and shows the results. Take the following Hello World example:

This code just creates a single variable (let keyword) and assigns a string value to it. When you run this code (select all the code text and press Alt + Enter), you’ll see the following result in the F# Interactive window (Figure 5):

You can also use C# with Accord.NET, but there’s a strong bias toward F# among people in the .NET space who work with ML, for the same reason that there’s a bias toward Scala over Java for Spark developers:  the functional programming paradigm works extremely well with mathematical concepts.  Also, in addition to Accord.NET, you might also want to check out Math.NET.  My experience has been that this package tends to be a bit faster than Accord.

Comments closed

Enabling LDAP Authentication On Cassandra

Kurt Greaves shows off a new LDAP authenticator for Apache Cassandra:

The LDAPAuthenticator is implemented using JNDI, and authentication requests will be made by Cassandra to the LDAP server using the username and password provided by the client. At this time only plain text authentication is supported.

If you configure a service LDAP user in the ldap.properties file, on startup Cassandra will authenticate the service user and create a corresponding role in the system_auth.roles table. This service user will then be used for future authentication requests received from clients. Alternatively (not recommended), if you have anonymous access enabled for your LDAP server, the authenticator allows authentication without a service user configured. The service user will be configured as a superuser role in Cassandra, and you will need to log in as the service user to define permissions for other users once they have authenticated.

The authenticator itself is hosted on GitHub, so you can check out its repo too.

Comments closed

When To Use SQL, DAX, Or M In Power BI Models

Paul Turley offers up some guidance on when to use which language when building Power BI models:

As a general rule of thumb, in formal SSAS projects built on a relational data mart or data warehouse that is managed by the same project team as the BI data model, I typically recommend that every table in the model import data from a corresponding view or UDF stored and managed in the relational database. Keep in mind that is the way we’ve been designing Microsoft BI projects for several years. Performing simple tasks like renaming columns in the SSAS data model designer was slow and cumbersome. Performing this part of the data prep in T-SQL was much easier than in SSDT. With the recent advent of Power Query in SQL Server Data Tools, there is a good argument to be made for managing those transformations but the tool is still new and frankly I’m still testing the water. Again, keep changes in one place for future maintenance.

Do your absolute best to avoid writing complex SQL query logic that cannot be traced back to the sources. Complicated queries can become a black box – and a Pandora’s box if they aren’t documented, annotated and easy to decipher.

But do read Paul’s closing grafs on the importance of not being hidebound.

Comments closed

Mapping In Scala When Dealing With Futures & Options

Shubham Verma shows us what happens when we use the map() and flatMap() functions in Scala on fields which are marked as Options or Futures:

now let’s move towards the interesting part flatMap(), what it is supposed to do in case of Option so the flatMap() gives us the liberty to return the type of value that we want to return after the transformation, unlike map() in wherein when a parameter has the value Some the value would be of type Some what so ever, but its not with the case with flatMap()

scala> option.flatMap(x => None)
res13: Option[Nothing] = None
scala>
scala> option.map(x => None)
res14: Option[None.type] = Some(None)

the code snippet above clearly shows it, so is that it ? No not yet lets look on to one more feature of flatMap() on Option[+A]that comes to be real handy when we need to extract value out of options, supposedly we have list of type List[Option[Int]] now I am only interested in values that have some value which seems to be an obvious usecase in most of the times, we can simply do it using a ​flatMap() operation

In short, it’s a little more complex, but you can still get useful information.

Comments closed

Calculating TF-IDF Using Apache Spark

Arseniy Tashoyan shows us how to calculate Term Frequency-Inverse Document Frequency using Apache Spark:

TF-IDF is used in a large variety of applications. Typical use cases include:

  • Document search.
  • Document tagging.
  • Text preprocessing and feature vector engineering for Machine Learning algorithms.

There is a vast number of resources on the web explaining the concept itself and the calculation algorithm. This article does not repeat the information in these other Internet resources, it just illustrates TF-IDF calculation with help of Apache Spark. Emml Asimadi, in his excellent article Understanding TF-IDF, shares an approach based on the old Spark RDD and the Python language. This article, on the other hand, uses the modern Spark SQL API and Scala language.

Although Spark MLlib has an API to calculate TF-IDF, this API is not convenient to learn the concept. MLlib tools are intended to generate feature vectors for ML algorithms. There is no way to figure out the weight for a particular term in a particular document. Well, let’s make it from scratch, this will sharpen our skills.

Read on for the solution.  It seems that there tend to be better options today than TF-IDF for natural language problems, but it’s an easy algorithm to understand, so it’s useful as a first go.

Comments closed

Exception Handling In Scala

Shivangi Gupta shows off the Either keyword in Scala:

How to get values from Either?

There are many ways we will talk about all one by one.  One way to get values is by doing left and right projection. We can not perform any operation i.e, map, filter etc; on Either. Either provide left and right methods to get the left and right projection. Projection on either allows us to apply functions like map, filter etc.

For example,

scala> val div = divide(14, 7)
div: scala.util.Either[String,Int] = Right(2)

scala> div.right
res1: scala.util.Either.RightProjection[String,Int] = RightProjection(Right(2))

When we applied right on either, it returned RightProjection. Now we can extract the value from right projection using get, but if there is no value the compiler will blow up using get.

There’s more to Scala exception handling than just try-catch.

Comments closed

Serializing Data In Scala

Akhil Vijayan has a two-parter on serializing data in Scala.  In the first post, he looks at uPickle:

uPickle serializer is a lightweight Json library for scala. uPickle is built on top of uJson which are used for easy manipulation of json without the need of converting it to a scala case class. We can even use uJson as standalone too. In this blog, I will focus only on uPickle library.

Note: uPickle does not support Scala 2.10; only 2.11 and 2.12 are supported

uPickle (pronounced micro-pickle) is a lightweight JSON serialization library which is fast than many other json serializers. I will talk more about the comparison of different serializers in my next blog. This blog will cover all the basic stuff about uPickle.

Then, he follows up with a comparison to other serializers:

In my previous blog, I talked about how uPickle works. Now I will be comparing it will many other json serializers where I will be serializing and deserializing scala case class.

Before that let me discuss all the json serializers that I have used in my comparison. I will compare uPickle with PlayJson, Circe and Argonaut.

Check it out.

Comments closed

Selecting All Columns But One In Postgres

Lukas Eder shows off a BigQuery feature which you can partially implement in Postgres:

In BigQuery syntax, we could now simply write

SELECT * EXCEPT rk
FROM (...) t
WHERE rk = 1
ORDER BY first_name, last_name

Which is really quite convenient! We want to project everything, except this one column. But none of the more popular SQL databases support this syntax.

Luckily, in PostgreSQL, we can use a workaround: Nested records:

SELECT (a).*, (f).* -- Unnesting the records again
FROM (
  SELECT 
    a, -- Nesting the actor table
    f, -- Nesting the film table
    RANK() OVER (PARTITION BY actor_id ORDER BY length DESC) rk
  FROM film f
  JOIN film_actor fa USING (film_id)
  JOIN actor a USING (actor_id)
) t
WHERE rk = 1
ORDER BY (a).first_name, (a).last_name;

Notice how we’re no longer projecting A.* and F.* inside of the derived table T, but instead, the entire table (record). In the outer query, we have to use some slightly different syntax to unnest the record again (e.g. (A).FIRST_NAME), and we’re done.

Read the whole thing.  Lukas has a workaround for SQL Server, but I’d really like to see SELECT * EXCEPT [something] be viable syntax.  This is something I’d want to use more for ad hoc diagnostic queries, but I have one scenario where most columns on a table are narrow but then I have a big VARBINARY(MAX) (for good reason, I promise) that I almost never want to see in diagnostic queries.  I use a third-party SSMS plugin to populate all the columns and remove the one I don’t want, but it’d be nice to specify the other way because it’s so much faster to type.

Comments closed