Press "Enter" to skip to content

Month: August 2019

Biml Support in Visual Studio Code

Cathrine Wilhelmsen takes us through Biml support in Visual Studio Code:

Please note that you only get syntax highlighting with this extension. You do not get the full Biml or .NET intellisense, the BimlScript preview pane, or the ability to generate SSIS packages from Biml. For those things, you will still need BimlExpress for Visual Studio.

However! If you simply want to view your Biml files in a lightweight editor, the Biml Support extension works beautifully.

It’s not full support, but it’s something.


Database Page Allocations Function

Max Vernon takes us through the sys.dm_db_database_page_allocations Dynamic Management Function:

sys.dm_db_database_page_allocations is an undocumented SQL Server T-SQL Dynamic Management Function. This DMF provides details about allocated pages, allocation units, and allocation extents.

Read on for additional details. This is an undocumented function, so it might change between versions, but it will give you an idea of how it works under the covers.
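If you just want a quick feel for the output before reading Max's post, a query along these lines shows which pages a table owns. dbo.SomeTable is a placeholder, and because the DMF is undocumented, the parameter list and column names may shift between versions; 'DETAILED' mode fills in more columns than 'LIMITED' but reads every page, so be careful against large tables.

SELECT  allocated_page_file_id,
        allocated_page_page_id,
        page_type_desc,
        is_allocated,
        is_iam_page
FROM sys.dm_db_database_page_allocations(
        DB_ID(),                      -- current database
        OBJECT_ID(N'dbo.SomeTable'),  -- object to inspect (NULL for all objects)
        NULL,                         -- index id
        NULL,                         -- partition id
        'DETAILED')                   -- 'LIMITED' or 'DETAILED'
ORDER BY allocated_page_page_id;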


Using Bookmarks for Power BI Filters

Marc Lelijveld continues a series on storytelling with Power BI:

As said, being dynamic is a broad concept. Let's use the example shown above. As a report author, we can define that the end user should be looking at a top 10 ranking of countries (right side of the report). Since the difference between numbers 9 and 10 in the ranking is so small, you might want to know what the difference is compared to number 11. Right now, we can't see that. We need to change the filter context to see the rest of the ranking.

Click through for a step by step example of what to do.


Managed Instance Challenges

Joey D’Antoni has a few real-world challenges with migrating to Azure SQL Managed Instances:

While DMS is pretty interesting tooling, I had mostly ignored it until recently. Functionally, the tool works pretty well. The problem is that it requires a lot of privileges: you have to have someone who can create a service principal, and you need to have the following ports open between your source machine and your managed instance:

– 443
– 53
– 9354
– 445
– 12000

While the scope of those firewall rules is limited, in a larger enterprise, explaining why you need port 445 open to anything is going to be challenging.

The technology is intriguing, though it does seem like there are still some kinks to work out.


Confluent Platform 5.3

Gaetan Castelein announces Confluent Platform 5.3:

Introducing Confluent Operator for Kubernetes
Kubernetes has become the open source standard for orchestrating containerized applications, but running stateful applications such as Kafka can be difficult and requires a specialized skill set. Thus, we decided to automate the process for you.

For the past few months, we have been working closely with a set of customers and partners as part of a preview program to gather their early feedback. We are now ready to release Confluent Operator, our enterprise-ready implementation of the Kubernetes Operator API to automate deployment and key lifecycle operations of Confluent Platform on Kubernetes.

Looks like they’ve been busy lately.


Stream Processing with Kafka

Satish Sharma has a four-part series covering stream processing with Apache Kafka. Part 1 gives us an overview of Kafka:

Apache Kafka is an open-source distributed stream processing platform originally developed by LinkedIn and later donated to Apache in 2011.

We can describe Kafka as a collection of files, filled with messages that are distributed across multiple machines. Most Kafka analogies revolve around tying these various individual logs together, routing messages from producers to consumers reliably, replicating for fault tolerance, and handling failure gracefully. Its architecture inherits more from storage systems like HDFS, HBase, or Cassandra than it does from traditional messaging systems that implement JMS or AMQP. The underlying abstraction is a partitioned log, essentially a set of append-only files spread over several machines. This encourages sequential access patterns. A Kafka cluster is a distributed system that spreads data over many machines both for fault tolerance and for linear scale-out.
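To make the producer side of that pipeline a little more concrete, here is a minimal, hedged Java sketch using the standard Kafka client library; the broker address, key, and message value are placeholders rather than anything from Satish's series.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MinimalProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Records with the same key always land in the same partition of the log.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("tutorial-topic", "key-1", "hello, kafka"));
        }
    }
}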

Part 2 covers terminology and concepts:

Kafka Streams API
Kafka Streams API is a Java library that allows you to build real-time applications. These applications can be packaged, deployed, and monitored like any other Java application — there is no need to install separate processing clusters or similar special-purpose and expensive infrastructures!

The Streams API is scalable, lightweight, and fault-tolerant; it supports both stateless and stateful processing.
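To show just how small such an application can be, here is a hedged sketch of a Streams topology that upper-cases values from one topic into another; the application id and output topic name are made up for illustration.

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");     // made-up application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("tutorial-topic");
        source.mapValues(value -> value.toUpperCase()).to("tutorial-topic-upper");

        // Runs inside an ordinary JVM process; no separate processing cluster is needed.
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}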

Part 3 has you install and configure Kafka:

For quick testing, let's start a handy console consumer, which reads messages from a specified topic and displays them back on the console. We will use the same consumer to read all of our messages from this point forward. Use the following command:

Linux -> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic tutorial-topic --from-beginning

Windows -> bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic tutorial-topic --from-beginning
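To give that consumer something to read, you can run the matching console producer in a second terminal and type a few lines. Note that on the Kafka releases current at the time of the series the flag is --broker-list, while newer releases also accept --bootstrap-server:

Linux -> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic tutorial-topic

Windows -> bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic tutorial-topic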

Part 4 is forthcoming.


Learning Confluent Kafka

Stephane Maarek has some pointers on where to go to learn Confluent Kafka:

I teach online about Apache Kafka, and a very frequent and recurring question I get is:

How can I learn Confluent Kafka?

Let’s get right to it!

I've gone through a couple of Stephane's Kafka courses and they're excellent. There's still a lot more for me to go through, but if you're interested in learning Kafka, this is a great method.


PolyBase and Azul Zulu OpenJDK

I have a post looking at one of the more interesting changes in SQL Server 2019 CTP 3.2:

One of the more interesting parts of SQL Server 2019 CTP 3.2’s release notes is the relationship between Microsoft and Azul Systems. Travis Wright covers it in some detail, as well as what it means for customers.

Prior to SQL Server 2019 CTP 3.2, installing PolyBase required an installation of Oracle’s Java Runtime Environment 7 Update 51 or higher, either directly from Oracle or through OpenJDK.

Java is still required if you want to read from or write to Hadoop or Azure Blob Storage. Oracle’s flavor of Java is no longer required, however.


Keeping Bash Scripts Reusable

Kellyn Pot’vin-Gorman explains some of the concepts behind scripting for longevity:

I'm going to admit that the reason I didn't embrace PowerShell at first was that most of the examples I found were full of hardcoded values. I found it incredibly obtuse, but I started to realize that many of those examples came from sources who might not have the scripting history of folks raised on other shells (this is just my theory, without a lot of evidence to prove it, so keep that in mind). As PowerShell scripts have matured, I've noticed that more of them are built with dynamic values and advanced scripting options, and with this, I've become more comfortable with PowerShell.

I think the best way to learn is to see real examples, so let’s demonstrate.

Read on for those examples.
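Her post has the real examples; as a trivial, made-up sketch of the difference she's describing, compare a hardcoded call with one that takes its environment-specific values as arguments (the server, database, and table names below are hypothetical).

#!/bin/bash
# Hardcoded version: only ever works against one server and one database.
#   sqlcmd -S prod-sql01 -d SalesDB -Q "SELECT COUNT(*) FROM dbo.Orders;"

# Reusable version: the server and database come in as arguments.
SQL_SERVER="${1:?Usage: $0 <server> <database>}"
DB_NAME="${2:?Usage: $0 <server> <database>}"

sqlcmd -S "$SQL_SERVER" -d "$DB_NAME" -Q "SELECT COUNT(*) FROM dbo.Orders;"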


Tearing Down and Rebuilding Replication

Pamela Mooney takes us through tearing down replication, restoring a database, and rebuilding transactional replication with scripts:

If you use replication, you have doubtless been paged to restore a replicated database. You have experienced the ineffable joy of tearing down replication-dependent indexed views (if you have them), blowing away replication, doing the restore, putting replication and indexing back together again, and finally redeploying your indexed views. I know I have.

In fact, I’ve done it enough times that I didn’t want to do it anymore. So, you may ask, did I go to a different modality of replicating my data?  Did I go to Availability Groups or mirroring instead?  No.  I actually like replication.  It’s invaluable when you need to write code around real-time data (especially from a third party database), but you aren’t able to index the original copy.  It’s been around for a long time and is well vetted, and pretty forgiving, once you understand how it works.  So, no need to reinvent the wheel. I decided to automate replication instead.

This is specific to transactional replication. There’s a whole ‘nother kettle of fish for merge replication.
