Press "Enter" to skip to content

Curated SQL Posts

Biml Metadata, With And Without BimlStudio

Ben Weissman has a pair of posts regarding metadata models in Biml.  First up, he gives us the high-roller solution:

If you’re lucky enough to be a BimlStudio user, you have access to the Biml Metadata feature! This feature allows you to build a Metadata model that fits your exacts needs which can then be browsed and used through a Metadata Instance using a dynamic object model.

As you probably still want to maintain your metadata outside of BimlStudio, we’ve build this little piece of code. It will ready your meta-Schema from a given SQL Database and build a Biml Metadata-Model from it. In a second step, it will also import the contents of your model into an instance:

If your company doesn’t want to shell out the cash to buy a license for BimlStudio, Ben also has a version for people using the free BimlExpress tool:

So maybe you’ve noticed our blog post on deriving metadata from SQL Server into BimlStudio, but you are using BimlExpress and therefore don’t have access to the feature described in there? While it’s true, that BimlExpress doesn’t support the Metadata features of the Biml language, there are similar ways of achieving a flexible metadata model in Biml.

This post shows you, how you can build a model in SQL Server, import it to Biml including derived relationships etc. and use it in a very flexible way.

To get started, we need to set up a sample model in SQL Server first. You can either download the scripts from https://solisyon.de/files/SetupDerivedMetadata.sql or scroll to the very end of that page. Although your individual model can and will differ from this one, we suggest you follow along using our example and start tweaking it afterwards!

Once you really get how Biml converts metadata to packages, life gets so much easier.

Comments closed

Using dbatools To Determine SQL Server Versions

Simone Bizzotto walks us through a new dbatools feature:

You get back on a jiffy:
– the Build
– the Major Release
– the Service Pack
– the Cumulative Update
– the KB related to that version
– when the support for that version ends
– if all of the above are matching a verified build
– if a warning is shown, you passed a bad build or the JSON must be updated

Getting the build is easy; getting some of this other information is where they add a lot of value.

Comments closed

Master Data In Azure

Matt How explains why Master Data Services isn’t a great cloud-based master data management solution and offers up an alternative:

Excel is easy to use, but not user friendly

Excel is on nearly every desktop in any Windows based organisation and with the Master Data Services Add-in, it puts the data well within the reach of the users. Whilst it is simple it is in no way user friendly when compared to other applications that your users may be using. Not to mention that for most this will be the only part of the solution they see! Wouldn’t it be great if there was a way to supply the same data but with an intuitive, mobile ready front end that people enjoy using?

Developers are tightly constrained

Developers like to develop, not choose options from drop down menus in a web based portal. With MDS, not only can Devs not make use of Visual Studio and a like but they are very tightly constrained by the business rules engine. At this point we should be able to make use of our preferred IDE so that we can benefit from source control, frameworks and customised business logic.

Not scalable according to modern expectations

Finally, MDS cannot scale to handle any kind of “big data”. It’s a bit of buzz word but as businesses collect more and more data, we need a data management option that can grow with that data. Due to the fact that MDS must be deployed from a server, there is no easy way to meet those big data requirements.

There are a few pieces to Matt’s solution, making for an interesting read.

Comments closed

Unintentional Data

Eric Hollingsworth describes data science as the cost of collecting data approaches zero:

Thankfully not only have modern data analysis tools made data collection cheap and easy, they have made the process of exploratory data analysis cheaper and easier as well. Yet when we use these tools to explore data and look for anomalies or interesting features, we are implicitly formulating and testing hypotheses after we have observed the outcomes. The ease with which we are now able to collect and explore data makes it very difficult to put into practice even basic concepts of data analysis that we have learned — things such as:

  • Correlation does not imply causation.
  • When we segment our data into subpopulations by characteristics of interest, members are not randomly assigned (rather, they are chosen deliberately) and suffer from selection bias.
  • We must correct for multiple hypothesis tests.
  • We ought not dredge our data.

All of those principles are well known to statisticians, and have been so for many decades. What is newer is just how cheap it is to posit hypotheses. For better and for worse, technology has led to a democratization of data within organizations. More people than ever are using statistical analysis packages and dashboards, explicitly or more often implicitly, to develop and test hypotheses.

This is a thoughtful essay well worth reading.

Comments closed

Crossing The Streams With Kafka

Himani Arora shows how to join two Kafka streams together:

KStream-KStream Join

It is a sliding window join, that means, all tuples close to each other with regard to time are joined. Time here is the difference up to size of the window.

These joins are always windowed joins because otherwise, the size of the internal state store used to perform the join would grow indefinitely.

In the following example, we perform an inner join between two streams. The output the joined stream will be of type KStream<K, ...>

Read on to learn more about two additional join types.

Comments closed

Measuring Semantic Relatedness

Sandipan Dey re-works a university assignment on semantic relatedness in Python:

Let’s define the semantic relatedness of two WordNet nouns x and y as follows:

  • A = set of synsets in which x appears
  • B = set of synsets in which y appears
  • distance(x, y) = length of shortest ancestral path of subsets A and B
  • sca(x, y) = a shortest common ancestor of subsets A and B

This is the notion of distance that we need to use to implement the distance() and sca() methods in the WordNet data type.

It looks like a helpful assignment for understanding natural language processing a little better.

Comments closed

Trigger Nuance

Denis Gobo offers up some good advice on triggers:

Most common mistake people make when first starting writing triggers is that they write it in such a way that it will only work if you insert/update/delete one row at a time. A trigger fires per batch not per row, you have to take this into consideration otherwise your DML statements will blow up. How to do this is explained in this post Coding SQL Server triggers for multi-row operations, there is no point recreating that post here.
Another problem that I see is that some people think a trigger is SQL Server’s version of crontab, you will see code that sends email, kicks off jobs, runs stored procedures. This is the wrong approach, a trigger should be lean and mean, it should execute as fast as possible, if you need to do some additional things then dump some data from the trigger into a processing table and then use that table to do your additional tasks. Don’t use triggers as a messaging system either, SQL Server comes with Service Broker, use that instead.

Good reading.  There are valid reasons for triggers, and ignoring them altogether is almost as bad as misusing them.

Comments closed

Basics Of Elasticsearch In .NET

Ivan Cesar gives us a brief tutorial of the Elasticsearch .NET API:

To be able to search something, we must store some data into ES. The term used is “indexing.”

The term “mapping” is used for mapping our data in the database to objects which will be serialized and stored in Elasticsearch. We will be using Entity Framework (EF) in this tutorial.

Generally, when using Elasticsearch, you are probably looking for a site-wide search engine solution. You will either use some sort of feed or digest, or Google-like search which returns all the results from various entities, such as users, blog entries, products, categories, events, etc.

These will probably not just be one table or entity in your database, but rather, you will want to aggregate diverse data and maybe extract or derive some common properties like title, description, date, author/owner, photo, and so on. Another thing is, you probably won’t do it in one query, but if you are using an ORM, you will have to write a separate query for each of those blog entries, users, products, categories, events, or something else.

Check out Ivan’s tutorial for several examples.  Elasticsearch is really good for text-based search and simple aggregations, but it probably shouldn’t be a primary data store for any data you really care about.

Comments closed

Spelunking In The SSRS REST API

Chris Webb uses Power BI to look at the new SQL Server Reporting Services 2017 REST-based API:

And the online documentation for the API is here:

https://app.swaggerhub.com/apis/microsoft-rs/SSRS/2.0

Interestingly, the new API seems to be OData compliant – which means you can browse it in Power BI/Get&Transform/Power Query and build your own reports from it. For example in Power BI Desktop I can browse the API of the SSRS instance installed on my local machine by entering the following URL:

This is something that SSRS has been missing for a long time.  I’m glad they’re introducing a real API.

1 Comment