Press "Enter" to skip to content

Day: February 6, 2018

The Basics Of Jupyter Notebooks

Nigel Meakings has an introductory post to Jupyter Notebooks:

The Jupyter notebok environment consists of a browser-based notebook UI and a back-end server, running on port 8888 by default (if this port is taken it will start up on the next available port). This web server-based delivery of Notebooks means that you can browse to a remote server and execute your code there. This is the case, for example, when using a ready-made cluster such as an HDInsight Spark cluster, where all the tooling has been pre-installed for you. You open the notebook in the cluster portal within Azure, and it logs you in to the Jupyter server running on a node within the cluster. Note that if you want to allow multi-user access to your local Jupyter environment, you’ll need to be running a product such as JupyterHub.

I love using Jupyter when presenting because it’s the easiest way to intermix code, documentation, and images in one package, so it’s nice for pedagogical purposes.

Comments closed

Active Directory Management With Powershell

Jana Sattainathan walks us through a few AD management scenarios using the Powershell Active Directory module:

If you are an AD admin, you are very likely a pro at managing AD group membership but for mere mortals, this can be a tedious task. Please read on to find out what I am talking about and how to address that easily

For example, you get an urgent request to

  • Move a bunch of users from one AD group to another

  • To make matters worse, you are not given any ID’s but rather just the names!

Jana shows how to whip up a script to perform this migration in a few lines of code, as well as several other scenarios.

Comments closed

Azure SQL Database Managed Instances

Tim Radney explains the idea behind Azure SQL Database Managed Instances:

Managed Instances were first introduced at the Microsoft Build conference in Spring 2017, and so far the preview has been limited to a small number of customers and consultants. Managed Instances can be considered a hybrid between a full version of SQL Server and Azure SQL Database. Single and elastic databases are built upon a database-scoped programming model and Managed Instances are built on an instance-scoped programming model. This makes Managed Instances more compatible with on-premises SQL Server.

Managed Instances provide much more of a feel of an on-premises SQL Server, yet are built on the same infrastructure as Azure SQL Database. What sets it apart from Azure SQL Database is that it presents an entire SQL Server instance to the customer. In Azure SQL Database, you configure a server, which is really a container, and then can have multiple databases in that server, but they can’t easily talk to one another. With Managed Instances, all databases within the instance are on the same actual SQL Server, so you have full support for cross-database queries. This is a huge feature for many applications that otherwise were not a good fit for Azure SQL Database, and I think that’s going to allow many more SQL Server applications to move into the cloud.

Managed Instances has the potential to make an Azure migration realistic for many companies sticking to the on-prem product or managing a bunch of IaaS VMs.

Comments closed

Clustered Indexes And Automatic Sorting

Kendra Little demonstrates that clustered indexes do not give us an automatic sorting of our data:

There is no “default” ordering that a query will fall back on outside of an ORDER BY clause.

  • Results may come back in the order of the clustered index, or they may not
  • Even if results come back in the order of the clustered index on one run of the query, they may not come back in the same order if you run it again

If you need results to come back in a specific order, you must be explicit about it in the ORDER BY clause of the query.

Click through for a demo proving these two points.

Comments closed

Recovering A Log-Shipped Database

Sander Stad wraps up his series on log shipping:

Initially log shipping was meant to be used as a warm standby. You have your data on another instance but you still have some human intervention to get it all back up.

Imagine the following situation. You have setup log shipping using either the GUI or the commands in dbatools. You have about 15 databases and everything is working fine.

Until one day the primary instance goes down and is not recoverable. For the production to continue you have to bring the log shipped databases online fast.

You have to figure what the last transaction log backup was. You have to check if it was copied to the secondary instance and if it’s restored.

To do this by running a couple of queries, copying the files if needed and run the log shipping jobs takes time. I’d rather run a command and recover one or more databases and get back to the problem of the primary instance.

Read on to see how to use dbatools to recover a log shipped database.

Comments closed

Building A Power BI Date Dimension

Martin Schoombee explains why you should have a date dimension in Power BI and then has a script which generates one:

If your source system does not contain a Date entity or dimension, a better way is to create a such an entity with Power Query (aka “M”). By creating your own, you can add more attributes than just year, quarter and month…and optimize the size and performance of your model at the same time.

Not a Power Query expert? You don’t have to be. Others in the community have already shared their versions, and I will share my own modified version that you can copy and use for yourself. I’ve used the examples of Matt Masson and Chris Webb as the basis of my own Date entity. You can find their versions here and here.

Read on for Martin’s version.

Comments closed

K-SSIS: Examining SSIS Packages

Ed Elliott has a new tool available:

The main part of the UI is a graphical map of the objects in a package. I have spent quite a lot of time trying to think of a better way of showing the elements that make up a package, and it is hard to argue with the workflow approach, all the ETL tools I found do the same thing and visualizing packages as a series of interconnected tasks is useful but maybe not everything and the only way.

Because of all these issues I have really tried to stay away from SSIS development and looked for other solutions such as using biml to write packages or writing code in other languages but I keep coming back to the fact that the SSIS engine is great at what it does and if you have a lot of data or a lot of files/tables then writing a similar framework would be a lot of wasted effort. So, how can we use the SSIS engine but make better, more efficient use of our development time?

Click through to see a couple of screenshots as well as a link to download K-SSIS.

Comments closed