Using Azure Cloud Shell

Jeffrey Verheul shows off a bit of Azure Cloud Shell:

Connecting to a database
Now that your Cloud Shell is ready to go, you can start using Bash. This means you can also use sqlcmd from within Bash.

You can connect to a database with sqlcmd, by using the following command:

sqlcmd -S servername -U username -P password

Once the connection to your database has been made, you can run queries against it.

For now, Bash is the only supported shell; PowerShell support is in the works.

Golang And SQL Server

Mat Hayward-Hill gives us another language to think about:

Right now I spend most of my time in Management Studio writing T-SQL. And I use PowerShell whenever I need to do something on more than one machine at a time. But now that Microsoft is embracing open source, should I be thinking the same and learn a new language which isn’t so Microsoft-centric?

After talking to some experts, I narrowed the choice down to two: Python and Go (also referred to as Golang). I picked Golang because it’s relatively new (open sourced in 2009, which is leading-edge for a language, whereas Python dates back to the late 1980s); nothing more complicated than that, as this project is just for fun!

I’d see this as more of a “Cool, I can do this now” type of language rather than a “Hey, drop what you’re doing and learn this!” language.  That may change over the next few years.

Code Formatting

Bert Wagner has a few tips on code formatting to make it more readable:

The second example above consistently indents lines, adds new lines, and follows consistent coding patterns. This makes it easy to skim the code quickly.

Books have chapters, headings, and paragraphs defined by formatting that make it easy to find what is needed at a glance — formatting code makes it possible to find things easily too.

The examples Bert uses are all C#, but apply to most languages.  I think consistency is key, even more so than your ideal format.  This reduces friction between developers, at least outside of the “what should our coding standards be?” meetings…

OCR With Tesseract

Amuda Adelou shows how to use Tesseract’s Java API to perform character recognition in images:

Extracting text from an image means that you are considering the flowchart imagery that’s processed to extract the text components and then extracting the geometrical shapes components. The text components are extracted with geometrical components, as well. The internal relationship between the components is set up by tracing the flow lines that connect different components. The extracted components are output to metadata (in XML format), which is machine-readable. This metadata can be archived, stored in a knowledge base, or shared with others.

Click through for a demo app and code.

Minimizing Shared State

Vladimir Khorikov has an example of removing shared state and making application code more “honest” as a result:

Note how we’ve removed the private fields. Getting rid of the shared state automatically decoupled the three methods and made the workflow explicit. Without the shared state, the only way we can carry data around is by using the methods’ arguments and return values. And that is exactly what we did: all three members now explicitly state required inputs and possible outputs in their signatures.

This is the essence of functional programming. With honest method signatures, it’s extremely easy to reason about the code as we don’t need to keep in mind hidden relationships between its different parts. It’s also impossible to mess up with the invocation order. If we try, for example, to put the second line above the first one, the code simply wouldn’t compile:

This is one of many reasons why I’m fond of functional programming.
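
To make the idea concrete, here is a minimal sketch in Python rather than the language of the original post. The names (Order, parse_order, compute_total, format_receipt) are hypothetical and are not taken from Vladimir’s code. Each function takes what it needs as arguments and returns its result, so the order of the workflow is visible in the data flow instead of being hidden in private fields.

```python
from dataclasses import dataclass


# Hypothetical example types and functions; not the code from the linked post.
@dataclass(frozen=True)
class Order:
    quantity: int
    unit_price: float


def parse_order(raw: str) -> Order:
    """Turn a 'quantity,unit_price' string into an Order."""
    quantity, unit_price = raw.split(",")
    return Order(int(quantity), float(unit_price))


def compute_total(order: Order) -> float:
    """Requires an Order, so it can only be called after parse_order."""
    return order.quantity * order.unit_price


def format_receipt(total: float) -> str:
    """Requires a total, so it can only be called after compute_total."""
    return f"Total: {total:.2f}"


# The workflow is explicit: each step consumes the previous step's output.
order = parse_order("3,2.50")
total = compute_total(order)
receipt = format_receipt(total)
print(receipt)
```

Python will not refuse to compile a misordered version the way a statically typed language would, but swapping the first two lines fails immediately because order does not exist yet, and a type checker can flag mismatched arguments before the code runs.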

Case-Insensitive Power Query Sorts

Cedric Charlier points out the comparisonCriteria parameter on Table.Sort in Power Query:

Have you already tried to sort a table based on a text field? The result is usually a surprise for most people. M language has a specific implementation of the sort engine for text where upper case letters are always ordered before lower case letters. It means that Z is always before a. In the example (here under), Fishing Rod is sorted before Fishing net.

The classical trick to escape from this weird behavior is to create a new column containing the upper case version of the text that will be used to sort your table, then configure the sort operation on this newly created column. This is a two-step approach (three steps, if you take into account the need to remove the new column). Nothing wrong with this except that it obfuscates the code and I hate that.

Click through to learn a more elegant way of sorting.
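
The ordering behavior is not unique to M. As an illustration only, here is the same effect in Python, not in M and not the approach from Cedric’s post. The product names beyond the two in the excerpt are made up for the example. A plain sort compares character codes, so every upper case letter lands before any lower case one; supplying a comparison key avoids the extra column.

```python
# "Fishing Rod" and "Fishing net" come from the excerpt; the rest are made up.
products = ["Fishing net", "Fishing Rod", "anchor", "Zodiac boat"]

# Default comparison is by character code: 'R' (82) sorts before 'n' (110),
# and 'Z' (90) sorts before 'a' (97).
ordinal = sorted(products)

# Comparing case-folded keys ignores case without adding a helper column.
ignoring_case = sorted(products, key=str.casefold)

print(ordinal)
print(ignoring_case)
```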

Embedded Solr With Scala

Anurag Srivastava shows how to use Embedded Solr using an example written in Scala:

Embedded Solr has the same interface as Solr without requiring an HTTP connection. When we “embed” Solr into a Java application, it provides the exact same API that you would use if you were connecting to a remote Solr instance. We can use embedded Solr for in-memory testing because when we implement test cases, they should not depend on any external resources.

Read on for the code sample.

Weights In Graphs

Angshuman Talukdar shows how to use Neo4j to solve minimum weighted distance problems:

A sample dataset is created in Neo4j using the CREATE clause in Cypher as given in Query 1 (create clause in Cypher). This loads the data into Neo4j and generates the graph database as shown in Figure 2.

Neo4j ships with a lot of graph algorithms as a package, but those are accessible only from the Java API. Implementing some of these algorithms in Cypher is quite complex and time-consuming. Starting with Neo4j 3.x, the concept of user-defined procedures was introduced, along with a library of them called APOC (Awesome Procedures On Cypher). These are custom implementations of certain functionality that can’t be (easily) expressed in Cypher itself. The APOC library consists of many (about 300) procedures to help with tasks in areas like data integration, graph algorithms, or data conversion.

Graph databases aren’t common, but they can be very useful for certain questions like the one Angshuman solves.
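
For a feel of what a minimum weighted distance calculation involves, here is a small sketch of Dijkstra’s algorithm in plain Python. It is one common way to solve the problem and is not the Cypher or APOC code from Angshuman’s post; the graph, node names, and weights are invented for the example.

```python
import heapq


def shortest_distances(graph, source):
    """Return the minimum total weight from source to every reachable node.

    graph maps each node to a dict of {neighbor: edge_weight}.
    Weights are assumed to be non-negative, as Dijkstra's algorithm requires.
    """
    distances = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        # Skip stale queue entries that were superseded by a shorter path.
        if dist > distances.get(node, float("inf")):
            continue
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Hypothetical directed graph: the direct A -> B edge costs more than A -> C -> B.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}
distances = shortest_distances(graph, "A")
print(distances)
```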

Converting To Local Time In M

Chris Webb shows how to convert a datetime from UTC to your local time zone using M:

Here’s a brief explanation of what the query does:

  • First it reads the times from the Excel table and sets the Time column to be datetime data type

  • It then creates a new column called UTC and then takes the values in the Time column and converts them to datetimezone values, using the DateTime.AddZone() function to add a time zone offset of 0 hours, making them UTC times

  • Finally it creates a column called Local and converts the UTC times to my PC’s local time zone using the DateTimeZone.ToLocal() function

The approach has limitations: you can’t convert to an arbitrary time zone while still retaining Daylight Saving Time awareness.
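
The same three steps can be sketched outside of M. The following is a rough Python analogue, not Chris’s query: the sample timestamp is made up, and the standard library calls stand in for DateTime.AddZone() and DateTimeZone.ToLocal().

```python
from datetime import datetime, timezone

# Step 1: a plain datetime with no time zone attached (made-up sample value).
naive = datetime(2017, 5, 1, 12, 0, 0)

# Step 2: attach an offset of 0 hours, which marks the value as UTC.
utc_time = naive.replace(tzinfo=timezone.utc)

# Step 3: convert to the time zone configured on the machine running the code.
local_time = utc_time.astimezone()

print(utc_time.isoformat())
print(local_time.isoformat())
```

The printed local value depends on the machine’s time zone settings, but it always represents the same instant as the UTC value.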

Concurrency In Scala

Matthew Rathbone shows different concurrency options available in Scala:

Scala is a functional programming language that aims to avoid side effects by encouraging you to use immutable variables (called ‘values’), and data structures.

So by default in Scala when you build a list, array, string, or other object, that object is immutable and cannot be changed or updated.

This might seem unrelated, but think about a thread which has been given a list of strings to process, perhaps each string is a website that needs crawling.

In the Java model, this list might be updated by other threads at the same time (adding / removing websites), so you need to make sure you either have a thread-safe list, or you safeguard access to it with the synchronized keyword or a Mutex.

By default in Scala this list is immutable, so you can be sure that the list cannot be modified by other threads, because it cannot be modified at all.

While this does force you to program in different ways to work around the immutability, it does have the tremendous effect of simplifying thread-safety concerns. The value of this cannot be overstated: it’s a huge burden to worry about thread safety all the time, but in Scala much of that burden goes away.

Read the whole thing if you’re looking at writing Spark applications in Scala. If you’re thinking about functional programming in .NET languages, F# is there for you.
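
The point about immutable collections carries over to other languages. As a loose illustration in Python rather than Scala, the sketch below hands an immutable tuple of sites to several threads. The site names and the crawl function are placeholders, and the “crawl” is a stand-in that does no network access.

```python
from concurrent.futures import ThreadPoolExecutor

# A tuple cannot be changed after creation, so no thread can add or
# remove sites while the others are reading it.
sites = ("example.com", "example.org", "example.net")


def crawl(site: str) -> str:
    """Placeholder for real crawling work; it only upper-cases the name."""
    return site.upper()


# Each worker reads from the shared tuple; no lock is needed around it.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(crawl, sites))

print(results)
```

Because the workers only read the shared collection and return new values, there is nothing to guard with a lock, which is the simplification the excerpt describes.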


May 2017