Month: June 2017

Security And ZooKeeper

Michael Han describes a few methods you can use to tighten up (or rather, introduce) security in ZooKeeper:

Four Letter Words (acronym as 4lw) is a very popular feature of the Apache ZooKeeper project. In a nutshell, 4lw is a set of commands that you can use to interact with a ZooKeeper ensemble through a shell interface. Because it’s simple and easy to use, lots of ZooKeeper monitoring solutions are built on top of 4lw.

The simplicity of 4lw comes at a cost: the design did not originally consider security, there is no built in support for authentication and access control. Any user that has access to the ZooKeeper client port can send commands to the ensemble. The 4lw commands are read only commands: no actions can be performed. However, they can be computing intensive, and sending too many of them would effectively create a DOS attack that prevents the ensemble’s normal operation.

Read on for details.
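
To make the "anyone who can reach the client port can send commands" point concrete, here is a minimal sketch of a 4lw client in Python. It is not from Michael's post; the function name `four_letter_word` and the timeout default are my own assumptions. It relies only on the fact that a 4lw command is four ASCII characters written to the client port, after which the server replies and closes the connection.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send one four-letter-word command and return the raw text reply.

    Hypothetical helper for illustration; no authentication is involved,
    which is exactly the concern raised in the quoted post.
    """
    if len(command) != 4:
        raise ValueError("4lw commands are exactly four characters")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            # The server closes the connection once it has replied,
            # so an empty read marks the end of the response.
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a test ensemble on the default client port, `four_letter_word("localhost", 2181, "ruok")` should come back as `imok` when the server is healthy. Depending on the ZooKeeper version and configuration, the server may only answer commands that have been explicitly whitelisted.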

Moving Docker Containers

Andrew Pruski shows how to change the default location for Docker containers on Windows:

There’s a switch that you can use when starting up the docker service that will allow you to specify the container/image backend. That switch is -g

Now, I’ve gone the route of not altering the existing service but creating a new one with the -g switch. Mainly because I’m testing and like rollback options but also because I found it easier to do it this way.

Read the whole thing.

Flattening JSON With Purrr

Steph Locke shows how to use purrr to write functional style code in R:

And… et voila! A multi-language dataset with the language identified and the sentiment scored using purrr for easier to read code.

Using purrr with APIs makes code nicer and more elegant as it really helps interact with hierarchies from JSON objects. I feel much better about this code now!

Purrr is something I really want to dig into for reasons just like this.

Cross-Platform PowerShell Remoting

Anthony Nocentino shows how to enter PowerShell sessions using OpenSSH-based remoting:

Nothing special here, simple syntax, but the seasoned PowerShell remoting pro will notice that we’re using a new parameter here -HostName. Normally on Windows PowerShell you have the -ComputerName parameter. Now, I don’t know exactly why this is different, but perhaps the PowerShell team needed a way to differentiate between OpenSSH and WinRM based remoting. Further, Enter-PSSession now has a new parameter -SSHTransport which at the moment doesn’t seem to do much since remoting cmdlets currently use OpenSSH by default. But if you read the code comments here, it looks like WinRM will be the default and we can use this switch parameter to specify SSH as the transport.

Once we execute this command, you’ll have a command prompt to the system that passed as a parameter to -HostName. The prompt below indicates you’re on a remote system by putting the server name you’re connected to in square brackets then your normal PowerShell prompt. That’s it, you now have a remote shell. Time to get some work done on that server, eh? Want to get out of the session, just type exit.

It’s interesting to see how well Microsoft is integrating Linux support into PowerShell (and vice versa, but that’s a different post).

Using OStress

Nikhilesh Patel explains how to use OStress to generate artificial database loads for stress testing:

OStress is a Microsoft tool that comes with the RML Utilities package and is used to stress SQL Server. This is especially useful when you want to troubleshoot SQL Server while SQL Server is under heavy load.

It is a free tool for SQL Server developers and DBAs. It is designed to assist with performance stress testing of T-SQL queries and routines. The tool automatically collects metrics to help you determine whether your queries will perform under load, and what kind of resource strain they put on a server. In short, it also allows putting a serious load on your database.

OStress isn’t the easiest thing in the world to set up, but it works well.

Change Detection With Hashes

Nigel Meakins shows how to use HashBytes to roll your own change detection:

So this all sounds very promising as a way of tracking changes to our Data Warehouse data, for purposes such as extracting deltas, inserts and updates to Type I and II dimensions and so forth. It doesn’t have any show-stopping overhead for the hashing operations for the sizes of data typically encountered and storage isn’t going to be an issue. It is native to T-SQL so we can rerun our hash value generation in the engine where our data resides rather than having to push through SSIS or some other tool to generate this for us. Algorithms are universal and as such will give us the same values wherever used for the same bytes of input. Let’s go back to the basic idea for a minute and consider how we implement this.

This is particularly useful in cases where you have metadata columns you don’t much care about (e.g., last modified time). I do recommend using CONCAT (or CONCAT_WS, if you’re on SQL Server 2017) to do string concatenation, though; it’d remove the need for util.CastAsNVarchar and possibly more.
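
The general idea carries over to any language, so here is a rough Python sketch of hash-based change detection. It uses `hashlib` rather than T-SQL's HashBytes, and it is not Nigel's implementation: the function names, the separator, and the NULL marker are assumptions of mine. It hashes only the columns you care about, so a change to a metadata column such as last modified time does not register as an update.

```python
import hashlib

# Assumed conventions for this sketch: a unit-separator character between
# values and a NUL character standing in for missing (NULL) values.
SEPARATOR = "\x1f"
NULL_MARKER = "\x00"


def row_hash(row, columns):
    """Return a SHA-256 hex digest over the tracked columns of one row."""
    parts = []
    for column in columns:
        value = row.get(column)
        parts.append(NULL_MARKER if value is None else str(value))
    # Joining with a separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = SEPARATOR.join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def detect_changes(stored_hashes, incoming_rows, key, columns):
    """Compare incoming rows with stored hashes.

    stored_hashes maps a key value to the hash saved on the last load.
    Returns (inserts, updates) as lists of key values.
    """
    inserts, updates = [], []
    for row in incoming_rows:
        current = row_hash(row, columns)
        if row[key] not in stored_hashes:
            inserts.append(row[key])
        elif stored_hashes[row[key]] != current:
            updates.append(row[key])
    return inserts, updates
```

Leaving a column out of `columns` is all it takes to ignore it, and the explicit separator does roughly the job that CONCAT_WS would do on the SQL Server side.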

Minecraft And DevOps

Richie Lee has an essay comparing Minecraft to DevOps:

Let me be clear about something: If you don’t have your databases in source control, there’s no point in thinking about anything else. Everything else follows on from this point. Getting your code in source control is the absolute starting point of all deployment pipelines. Some people have very strong views about whether to use git or TFS, but frankly I’m less concerned about the SVN of choice and more concerned about whether all code that is deployed is in source control. But the point is there’s no point in fretting about how to use Octopus Deploy if you haven’t got your code in source control.

The morals of this story are to crawl before you walk, and when you do learn to walk, don’t walk on lava.  I like the extended Minecraft metaphor, which sets this post off from many others of its ilk.

PowerShell And Environment Variables

Adam Bertram explains how to use environment variables in PowerShell:

Environment variables are exposed with a PowerShell drive known as “$env:”. It’s possible to browse through all of the environment variables by typing $env: at the console and hitting the tab key. This will allow you to see the names of each environment variable in alphabetical order.

The $env: drive is the recommended place to refer to any environment variables with PowerShell. However, it’s possible also to read the variables via .NET in PowerShell by using the GetEnvironmentVariable static method on the Environment class. This is essentially the same task.

Read on to see how you can use these in your scripts.

Data Visualization Basics

Kameerath Kareem describes a few basic visualizations and explains when you might use them:

Cumulative distribution graph is a commonly used chart type to express the performance metrics in percentile; it plots the percent of users who had performance metric greater or lesser than the threshold for the website.

The graph below shows the CDF graph for web page response time

From the CDF graph above, we see that at the 90th percentile, the web page response time of a website is 10.3 seconds. This means that 10% of the users in the time frame that the data was collected in had an overall web page load time of more than 10.3 seconds.

These are metrics as they relate to systems operations, but the general rules apply elsewhere as well.  Also, 10.3 seconds to load a webpage seems…slow.
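
If you want to check the percentile reading yourself, here is a small Python sketch. The nearest-rank definition is one of several percentile conventions and is my choice, not necessarily what the charting tool in the post uses; the function names are also mine. It returns the smallest sample with at least the requested share of samples at or below it, and the second helper gives the share of samples above a threshold.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def fraction_slower_than(values, threshold):
    """Share of samples strictly greater than the threshold."""
    if not values:
        raise ValueError("need at least one sample")
    return sum(1 for v in values if v > threshold) / len(values)
```

With ten made-up load times where the ninth-slowest is 10.3 seconds, `percentile(times, 90)` returns 10.3 and `fraction_slower_than(times, 10.3)` returns 0.1, which matches the "10% of users waited longer" reading of the graph.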